Abstract

A hierarchy on a set $S$, also called a total partition of $S$, is a collection $\mathcal{H}$ of
subsets of $S$ such that $S \in \mathcal{H}$, each singleton subset of $S$ belongs to $\mathcal{H}$, and if
$A, B \in \mathcal{H}$ then $A \cap B$ equals either $A$ or $B$ or $\emptyset$. Every exchangeable random hierarchy
of positive integers has the same distribution as a random hierarchy $\mathcal{H}$ associated as
follows with a random real tree $T$ equipped with root element $0$ and a random
probability distribution $p$ on the Borel subsets of $T$: given $(T, p)$, let $t_1, t_2, \ldots$ be
independent and identically distributed according to $p$, and let $\mathcal{H}$ comprise all
singleton subsets of $\mathbb{N}$, and every subset of the form $\{j : t_j \in F_x\}$ as $x$ ranges over $T$,
where $F_x$ is the fringe subtree of $T$ rooted at $x$. There is also the alternative
characterization: every exchangeable random hierarchy of positive integers has the same
distribution as a random hierarchy $\mathcal{H}$ derived as follows from a random hierarchy $\mathscr{H}$
on $[0, 1]$ and a family $(U_j)$ of IID uniform $[0,1]$ random variables independent of $\mathscr{H}$:
let $\mathcal{H}$ comprise all sets of the form $\{j : U_j \in B\}$ as $B$ ranges over the members of $\mathscr{H}$.

AMS 2010 subject classifications: 60E99, 60F99, 62E10, 62B05. 
Keywords: exchangeable hierarchy, partition, random composition, fragmentation, 
continuum random tree, weighted real tree 



1 Background 

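Definition 1.1. A hierarchy on a finite set $S$ is a collection $\mathcal{H}$ of subsets of $S$ such that

(a) if $A, B \in \mathcal{H}$ then $A \cap B$ equals either $A$ or $B$ or $\emptyset$, and

(b) $S \in \mathcal{H}$, $\{s\} \in \mathcal{H}$ for all $s \in S$, and $\emptyset \in \mathcal{H}$.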

Hierarchies are known by several other names, including total partitions and laminar
families. For brevity we use the term hierarchy throughout the paper. If $\mathcal{H}$ is a hierarchy
on a finite set $S$ and $S_0 \subseteq S$ then the restriction of $\mathcal{H}$ to $S_0$ is the hierarchy on $S_0$ defined
as follows:

    $\mathcal{H}|_{S_0} := \{H \cap S_0 : H \in \mathcal{H}\}$.    (1)

Definition 1.2. A hierarchy on $\mathbb{N}$ is a sequence $(\mathcal{H}_n, n \ge 1)$ where for each $n$, $\mathcal{H}_n$ is a
hierarchy on $[n]$, and for every $n$, $\mathcal{H}_n = \mathcal{H}_{n+1}|_{[n]}$.
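
For finite sets, Definition 1.1 and the restriction (1) can be checked mechanically. The following is a minimal sketch, not part of the original paper: the function names `trivial_hierarchy`, `is_hierarchy` and `restrict` are hypothetical, and hierarchies are represented (by assumption) as Python sets of frozensets.

```python
from itertools import combinations

def trivial_hierarchy(S):
    """Smallest hierarchy on S: the whole set, every singleton, and the empty set."""
    S = frozenset(S)
    return {S, frozenset()} | {frozenset([s]) for s in S}

def is_hierarchy(H, S):
    """Check conditions (a) and (b) of Definition 1.1 for a collection H of frozensets."""
    # (b): S, all singletons and the empty set must belong to H
    if not trivial_hierarchy(S) <= set(H):
        return False
    # (a): any two members are nested or disjoint
    return all(A & B in (A, B, frozenset()) for A, B in combinations(H, 2))

def restrict(H, S0):
    """Restriction of H to S0 as in (1): intersect every member of H with S0."""
    S0 = frozenset(S0)
    return {A & S0 for A in H}

# Illustrative hierarchy on [4] with one non-trivial block {1, 2, 4}
H4 = trivial_hierarchy({1, 2, 3, 4}) | {frozenset({1, 2, 4})}
H3 = restrict(H4, {1, 2, 3})
```

In this example `H3` consists of the trivial hierarchy on {1, 2, 3} together with the block {1, 2}, so the pair (`H3`, `H4`) satisfies the consistency requirement of Definition 1.2 for n = 3.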

Less formally, a hierarchy describes a scheme for recursively partitioning a set S into 
finer and finer subsets, down to singletons. If S is finite, this has an elementary meaning: 
$S$ is partitioned into some set of blocks, then recursively: each non-singleton block that
remains is partitioned into further blocks, until only singletons remain, and the hierarchy 
is the entire collection of sets that ever appear in this process. If S is infinite, matters 
can be more complex: a continuous recursive process of splitting may be involved, as in 
Bertoin's theory of self-similar or homogeneous fragmentation processes [14], [15] which
have a natural regenerative structure. Alternatively, a hierarchy describes a process of 
coalescence, wherein the singleton subsets of S recursively coagulate to reconstitute the 
set S. We emphasize that time plays no role in our definition of a hierarchy: a hierarchy 
H encodes the contents of the blocks of some process of fragmentation (or coagulation) , 
but does not include any additional information about the order in which these blocks 
appear in this fragmentation (or coagulation) process. 




Figure 1: The tree on the right is the graph of $\mathcal{H} = \{\{1, 2, 4\}\} \cup \Xi([n])$. The other trees
are the graphs of $\mathcal{H}|_{[2]}$ and $\mathcal{H}|_{[3]}$. The trivial hierarchy $\Xi([n])$ is defined at (3).



Hierarchies on [n] are in bijective correspondence with certain trees. Explicitly, if T 
is a tree 

• with n leaves, each labeled by a distinct element of [n], 

• and having a distinguished vertex called the root, which is not a leaf, 

• with no internal vertices of degree two, except possibly the root 

• and no edge lengths or planar embedding 
then the map 

    $T \mapsto \{\{j \in [n] : v \text{ on path from leaf } j \text{ to root}\} : v \in V(T)\} \cup \{\emptyset\}$

sends $T$ to a hierarchy on $[n]$. Here, $V(T)$ denotes the set of vertices of $T$ (including root
and leaves). This map is a bijection, and we say that $T$ is the graph of the hierarchy
to which it is sent.
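
The map from trees to hierarchies can be made concrete on a small example. The sketch below is illustrative only: the name `tree_to_hierarchy` and the encoding of a rooted tree as a child-to-parent dictionary are assumptions, not notation from the paper.

```python
def tree_to_hierarchy(parent, leaves):
    """Send a rooted leaf-labelled tree to a hierarchy.

    parent maps every non-root vertex to its parent (hypothetical encoding);
    leaves is the collection of leaf labels.
    """
    vertices = set(parent) | set(parent.values())
    blocks = {v: set() for v in vertices}
    for leaf in leaves:
        v = leaf
        # walk from the leaf up to the root, recording the leaf at every vertex passed
        while v is not None:
            blocks[v].add(leaf)
            v = parent.get(v)  # the root has no parent, so this returns None there
    return {frozenset(b) for b in blocks.values()} | {frozenset()}

# Root with two children: leaf 3 and an internal vertex "a" carrying leaves 1, 2, 4
parent = {1: "a", 2: "a", 4: "a", "a": "root", 3: "root"}
H = tree_to_hierarchy(parent, [1, 2, 3, 4])
```

Each vertex contributes the set of leaves below it, so the leaves give the singletons, the root gives the whole set, and the internal vertex "a" gives the block {1, 2, 4}.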

Random hierarchies of both finite and infinite sets arise naturally in a number of 
applications, including stochastic models for phylogenetic trees [S71 EOl El [Ml [2^1 El El 
[511 EH [58], processes of fragmentation and coalescence [40 l [3l[n i [T2 l [T3 l [T6l[T7l[T8 l 
[I9l|20l[26ll^[32l[35l[40l[42], and statistics and machine learning [ll|22l [H [59l [21] . 
In these applications, the object of common interest is a rooted tree which describes
evolutionary relationships (in the case of phylogenetic trees) or the manner in which
an object fragments into smaller pieces (in the case of models of fragmentation) or 
some notion of class membership (in the case of hierarchical clustering). Such trees 
sometimes have edges equipped with lengths that measure the time between speciations, 
or the amount of time between fragmentation events, or some measure of dissimilarity 
or distance between classes, but the hierarchies we consider correspond with trees of this 
type without edge lengths; see the remark in Section [Sj 

Permutations act on hierarchies by relabeling the contents of constituent sets: if $\mathcal{H}$
is a hierarchy on $[n]$ and $\sigma$ a permutation of $[n]$, then

    $\sigma(\mathcal{H}) := \{\{\sigma(h) : h \in H\} : H \in \mathcal{H}\}$.

An exchangeable hierarchy on $\mathbb{N}$ is a random hierarchy $(\mathcal{H}_n, n \ge 1)$ on $\mathbb{N}$ for which for
every $n$ and every permutation $\sigma$ of $[n]$ there is the distributional equality

    $\sigma(\mathcal{H}_n) \overset{d}{=} \mathcal{H}_n$.    (2)
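
As a small illustration of the relabeling action, the sketch below applies a permutation to a fixed hierarchy on [4]. The name `relabel` and the dictionary encoding of the permutation are assumptions made for this example. Exchangeability itself is a statement about distributions and is not tested by a single relabeling.

```python
def relabel(H, sigma):
    """Apply the permutation sigma (a dict i -> sigma(i)) to every member of H."""
    return {frozenset(sigma[h] for h in A) for A in H}

# Hierarchy on [4]: trivial hierarchy plus the block {1, 2, 4}
H = {frozenset(), frozenset({1, 2, 3, 4}), frozenset({1, 2, 4})}
H |= {frozenset({s}) for s in range(1, 5)}

sigma = {1: 3, 2: 1, 3: 2, 4: 4}
sigma_inv = {v: k for k, v in sigma.items()}

H_sigma = relabel(H, sigma)          # the block {1, 2, 4} becomes {1, 3, 4}
H_back = relabel(H_sigma, sigma_inv)  # relabeling by the inverse restores H
```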

The purpose of this paper is to provide a de Finetti-type characterization of exchangeable
random hierarchies on $\mathbb{N}$. Theorem 1 states that every exchangeable random hierarchy
on $\mathbb{N}$ is derived as if by sampling IID points $(t_j, j \ge 1)$ from a random measure $p$
supported by a random real tree $T$: the blocks of $\mathcal{H}_n$ are the sets of the form
$\{j \in [n] : t_j \in F_x\}$ as $x$ ranges over $T$, where $F_x$ is the fringe subtree of $T$ rooted at $x$.
Real trees are tree-like metric spaces that are briefly discussed in Section 3.1; for a more
complete treatment see [33] and references therein. Theorem 3 is an alternate characterization:
every exchangeable hierarchy is derived as if from a sequence $(U_j)$ of IID uniform $[0,1]$
random variables and an independent random hierarchy $\mathscr{H}$ on $[0,1]$: the blocks of $\mathcal{H}_n$
are the sets of the form $\{j \in [n] : U_j \in B\}$ as $B$ ranges over elements of $\mathscr{H}$. That $\mathscr{H}$
is a random hierarchy on $[0,1]$ means simply that $\mathscr{H}$ is a random collection of subsets
of $[0,1]$ that satisfies (a) and (b) of Definition 1.1 with $[0,1]$ in place of $S$. For some
measure theory details concerning random hierarchies on $[0,1]$ see the Remark at the
end of Section 5.

As indicated in [TO], an exchangeable hierarchy $(\mathcal{H}_n)$ of the set of positive integers $\mathbb{N}$
is generated by each of Bertoin's homogeneous fragmentation processes. Associated
with each of Bertoin's homogeneous fragmentations there is a one-parameter family of
self-similar fragmentations, each obtained from the homogeneous fragmentation by a
suitable family of random time changes, and each generating the same random hierarchy
$(\mathcal{H}_n)$ on $\mathbb{N}$. An attractive feature of the self-similar fragmentations of index $\alpha < 0$ is
that each sample path of such a fragmentation is associated with a compact real tree
[41]. The sample paths of Kingman's coalescent [57] can likewise be naturally identified
with a compact real tree [32]. There has been considerable interest in describing real
tree limits of discrete trees with edge-lengths [Ml HSj |42l |68l ES] , and Theorem 1 of this
paper is in a similar vein.

This work forms part of a growing list of characterizations of infinite exchangeable 
combinatorial objects by de Finetti-type theorems. For example, Kingman characterized
exchangeable partitions of $\mathbb{N}$ [SB], Donnelly and Joyce and Gnedin characterized
composition structures [38], [28], Janson characterized exchangeable posets [50], and Hirth
characterized exchangeable ordered trees [46], about which we will say a few words below.
Many related de Finetti-type theorems are known [51 [M l [57 1 [5 ^ HS ] ,
and there are excellent treatments in [3 [S3] of related material. Such de Finetti-type
results are often proved via reverse martingale convergence arguments, similar in spirit 
to the modern approach to de Finetti's theorem in [31, Chapter 4]. Alternate approaches 
use harmonic analysis [33 1351 HSl [SB] or isometrics of [S] , or Choquet theory [3S] . The 
results of this paper are proved using a third approach, the key idea of which is to encode 
an exchangeable hierarchy using a binary array, show that this array inherits exchange- 
ability from the hierarchy, and apply well-known characterization theorems for arrays. 
A similar approach was first used by Aldous, who simplified Kingman's proof of the
characterization of exchangeable partitions of $\mathbb{N}$ by encoding such partitions as
exchangeable sequences of real random variables [7].

There are several papers on related topics. In [51 Theorem 3] it is shown that if 
$(R(k), k \ge 1)$ is a consistent family of exchangeable trees with edge lengths that is
leaf-tight then $(R(k), k \ge 1)$ is derived as if by sampling from a random real tree. (Aldous
also assumes that his trees are binary, but this assumption is not essential to his proof.) 
Since a hierarchy on N corresponds to a sequence of consistent trees without edge lengths, 
the main result of this paper can be seen as a variation on this result of Aldous, showing 
that leaf-tightness (and indeed any pre-defined notion of distance) is not needed to obtain 
a de Finetti type theorem for trees with exchangeable leaves. 

In [in], it is shown that every exchangeable $\mathcal{P}$-coalescent process corresponds to a
unique flow of bridges. An exchangeable $\mathcal{P}$-coalescent process is a Markov process
$(\Pi_t, t \ge 0)$ whose state space $\mathcal{P}$ is the set of partitions of $\mathbb{N}$, for which $\Pi_t$ is an
exchangeable partition of $\mathbb{N}$ for every $t \ge 0$, and whose increments are independent and
stationary, if the notion of "increments" of a $\mathcal{P}$-valued function is properly understood.
This provides a de Finetti-type characterization of exchangeable coalescents. One may
"forget" time by setting $\mathcal{H} := \{B \subseteq \mathbb{N} : B \in \Pi_t \text{ for some } t \ge 0\} \cup \{\mathbb{N}\}$ and thereby obtain
an exchangeable hierarchy $\mathcal{H}$ on $\mathbb{N}$ (the notation $B \in \Pi_t$ means that $B$ is a block in the
partition $\Pi_t$). The results of Bertoin and Le Gall in [T^] therefore provide a de Finetti-type
characterization of hierarchies that arise in this manner from exchangeable coalescents. Due
to the stationary, independent increments property, this class of hierarchies is far from
including every exchangeable hierarchy, so the present work may be seen as extending
the results of Bertoin and Le Gall.

Haas and Miermont [31] provide a de Finetti-type representation of self-similar
fragmentations of index $\alpha < 0$ that have no erosion or sudden loss of mass in terms of
continuum trees $(T, p)$ as follows: every such fragmentation $(F(t), t \ge 0)$ is derived as if
from a continuum tree $(T, p)$ by setting $F(t)$ equal to the decreasing sequence of masses of
connected components of $\{v \in T : ht(v) > t\}$ where $ht(v)$ denotes the distance from $v$ to
the root of $T$. This is proved by introducing a family $(R(k), k \ge 1)$ of trees derived from
an associated $\mathcal{P}$-fragmentation $(\Pi_t)$ whose sequence of ranked limit frequencies equals
$(F(t))$. Distances in these trees $R(k)$ are related to times between dislocations in $F(t)$,
and by using self-similarity the leaf-tight criterion of [2] is checked. The existence of the
representing tree $(T, p)$ is then a consequence of the aforementioned theorem of Aldous.
This provides a de Finetti-type theorem for self-similar fragmentations.

In [46], Hirth considers exchangeable ordered trees, which in our terms are exchangeable
hierarchies $\mathcal{H}$ on $\mathbb{N}$ in which every element $B \in \mathcal{H}$ besides $\mathbb{N}$ has an associated
pair of nonnegative integer-valued times $(N_B, M_B)$, which are the time at which $B$ is
"born" and the time at which $B$ "dies." There is also a partial order on such blocks $B$
that is unimportant for our purposes. At the instant of its death, $B$ gives birth to subsets
born at that instant, whose union is $B$. Hirth provides a de Finetti-type characterization
of exchangeable ordered trees using harmonic analysis techniques. Our hierarchies
are more general than Hirth's trees, since there is no "discrete time" associated to the
elements of a hierarchy. Our results may therefore be seen as an extension of Hirth's
result using probabilistic techniques instead of harmonic analysis.

2 Results 

This section provides some basic definitions and a statement of the main results of the 
paper. 

For arbitrary sets S we define 

    $\Xi(S) := \{S\} \cup \{\{s\} : s \in S\} \cup \{\emptyset\}$.    (3)

We call $\Xi([n])$ the trivial hierarchy on $[n]$; it is the smallest hierarchy on $[n]$ and we will
refer to it numerous times throughout the paper.

If $T$ is a rooted real tree and $(t_n, n \in \mathbb{N})$ a deterministic or random sequence of points
of $T$, the hierarchy derived from $T$ and $(t_n, n \in \mathbb{N})$ is the sequence $(\mathcal{H}_n, n \ge 1)$ defined
by

    $\mathcal{H}_n := \{\{j \in [n] : t_j \in F_x(T)\} : x \in T\} \cup \Xi([n])$,    (4)

where $F_x(T)$ is the fringe subtree of $T$ rooted at $x$,

    $F_x(T) := \{y \in T : x \text{ is in the geodesic path in } T \text{ from } y \text{ to the root of } T\}$.    (5)

Real trees are tree-like metric spaces discussed in more detail in Section 3.1.

Recall that a random measure $p$ is said to direct a family $(t_n, n \in \mathbb{N})$ of random
elements if conditionally given $p$, $(t_n, n \in \mathbb{N})$ is an IID family with distribution $p$.

Theorem 1. If $(\mathcal{H}_n, n \ge 1)$ is an exchangeable hierarchy on $\mathbb{N}$ then there is a
$(\mathcal{H}_n)$-measurable triple $((\mathcal{H}'_n, n \ge 1), (T, p), (t_1, t_2, \ldots))$, where $T$ is a random real tree, $p$ is a
random probability measure with support contained in $T$ almost surely, $(t_1, t_2, \ldots)$ is an
exchangeable sequence directed by $p$, and $(\mathcal{H}'_n)$ is a hierarchy both equal in distribution to
$(\mathcal{H}_n)$ and equal almost surely to the hierarchy derived from $T$ and the samples $(t_1, t_2, \ldots)$.

Theorem 1 is the main result of the paper, proved in Section 4, where we explicitly
construct the pair $(T, p)$. One of the issues in this construction is how lengths in $T$ are
defined. Our main device for defining lengths is the concept of most recent common
ancestor.

Definition 2.1. If $\mathcal{H}_n$ is a hierarchy on $[n]$ and $i, j \in [n]$, then the most recent common
ancestor (MRCA) of $i$ and $j$, denoted $(i \wedge j)_n$, is the intersection of all elements of $\mathcal{H}_n$
that contain both $i$ and $j$,

    $(i \wedge j)_n := \bigcap_{G \in \mathcal{H}_n : \{i,j\} \subseteq G} G$,    (6)

so, e.g., $(i \wedge i)_n = \{i\}$ if $i \le n$. We adopt the convention that if one of $i$ or $j$ is not in
$[n]$, $(i \wedge j)_n := \emptyset$. If $(\mathcal{H}_n, n \ge 1)$ is a hierarchy on $\mathbb{N}$, then we denote by $(i \wedge j)$ the MRCA
of $i$ and $j$ in $(\mathcal{H}_n)$, which is the following set,

    $(i \wedge j) := \bigcup_{n \ge 1} (i \wedge j)_n$,    (7)

where $(i \wedge j)_n$ is the MRCA of $i$ and $j$ in $\mathcal{H}_n$. When no confusion can arise, we
sometimes drop parentheses and subscripts from MRCAs to improve legibility. Also, when
discussing more than one hierarchy, e.g. $(\mathcal{G}_n)$ and $(\mathcal{H}_n)$, we may write $(i \wedge j)_{\mathcal{G}_n}$ or $(i \wedge j)_{\mathcal{G}}$
to denote the MRCA of $i$ and $j$ in $\mathcal{G}_n$ or in $(\mathcal{G}_n)$.
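
For a hierarchy on a finite set the MRCA of Definition 2.1 is a direct computation. The following sketch is illustrative: the function name `mrca` and the set-of-frozensets representation are assumptions, and the example hierarchy is the one with a single non-trivial block {1, 2, 4}.

```python
def mrca(H, i, j, n):
    """MRCA of i and j in a hierarchy H on [n]: intersect all members containing both."""
    if i > n or j > n:
        return frozenset()          # convention: empty set if i or j is outside [n]
    result = frozenset(range(1, n + 1))  # [n] itself always contains both i and j
    for G in H:
        if i in G and j in G:
            result &= G
    return result

# Hierarchy on [4]: trivial hierarchy plus the block {1, 2, 4}
H4 = {frozenset(), frozenset({1, 2, 3, 4}), frozenset({1, 2, 4})}
H4 |= {frozenset({s}) for s in range(1, 5)}

a = mrca(H4, 1, 2, 4)   # {1, 2, 4}
b = mrca(H4, 1, 3, 4)   # {1, 2, 3, 4}
c = mrca(H4, 2, 2, 4)   # {2}
d = mrca(H4, 1, 5, 4)   # empty, since 5 is not in [4]
```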

To presage later developments, the family of indicators $(1(k \in (i \wedge j)), k \notin \{i, j\})$ is
exchangeable, so the limit

    $1 - \lim_{n \to \infty} \frac{1}{n} \#\{k \in [n] : k \in (i \wedge j)\}$

exists almost surely. Also, the MRCA of $i$ and $j$ corresponds to a particular vertex in
the graph of $\mathcal{H}_n$: the unique vertex found both in the path from root to leaf $i$ and in
the path from root to leaf $j$ that is at maximal graph distance from the root. This
vertex has a counterpart, $v$ say, in the tree $T$ of Theorem 1, and as will be made clear
in the proof of that theorem, the distance from root to $v$ will be the limit displayed above.

For comparison with Theorem 1 we state a version of Kingman's representation
theorem for exchangeable partitions. Some preliminary definitions are necessary. Suppose
that $\mathscr{P}$ is a fixed or random partition of $[0, 1]$ and that $(U_n, n \ge 1)$ is an IID sequence of
uniform $[0,1]$ random variables independent of $\mathscr{P}$. We say that a random partition $\Pi$ of $\mathbb{N}$
is derived as if by uniform sampling from $\mathscr{P}$ if $\Pi$ is equal in distribution to the partition
of $\mathbb{N}$ which puts $i$ and $j$ in the same block if and only if $U_i$ and $U_j$ lie in the same block
of $\mathscr{P}$. (We disregard for the moment the measure-theoretic details concerning random
partitions of $[0, 1]$.)

A random partition $\Pi$ of $\mathbb{N}$ is said to be exchangeable if the random array
$(\rho(i, j), i, j \in \mathbb{N})$ defined by

    $\rho(i, j) = 1$ if $i$ and $j$ are in the same block of $\Pi$, and $\rho(i, j) = 0$ otherwise,

is exchangeable, meaning that for every $n \ge 1$ and every permutation $\sigma$ of $[n]$

    $(\rho(\sigma(i), \sigma(j)), i, j \in [n]) \overset{d}{=} (\rho(i, j), i, j \in [n])$.    (8)

The following is a weak version of Kingman's representation theorem for such partitions.

Theorem 2 ([SB]). If $\Pi$ is an exchangeable partition of $\mathbb{N}$ then there is a $\Pi$-measurable
random partition $\mathscr{P}$ of $[0, 1]$ for which $\Pi$ is derived as if by uniform sampling from $\mathscr{P}$.
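
The notion of being derived as if by uniform sampling can be simulated for a simple fixed partition of [0, 1]. The sketch below is illustrative and rests on assumptions not in the paper: the partition consists of finitely many half-open intervals determined by `cuts`, every point beyond the last cut is taken to be a singleton block, and the name `paintbox_partition` is hypothetical.

```python
import random
from bisect import bisect_right

def paintbox_partition(n, cuts, rng):
    """Sample the restriction to [n] of a partition derived by uniform sampling.

    cuts = [c0, c1, ..., cm] with c0 = 0 defines interval blocks [c0, c1), ..., [c_{m-1}, cm);
    every point in [cm, 1] is treated as a singleton block (an assumption of this sketch).
    """
    blocks = {}
    for i in range(1, n + 1):
        u = rng.random()
        k = bisect_right(cuts, u) - 1            # index of the last cut that is <= u
        key = k if k < len(cuts) - 1 else ("singleton", i)
        blocks.setdefault(key, set()).add(i)
    return sorted(sorted(b) for b in blocks.values())

rng = random.Random(1)
# Two interval blocks [0, 0.5) and [0.5, 0.8); points in [0.8, 1] are singletons
sample = paintbox_partition(10, [0.0, 0.5, 0.8], rng)
```

Two integers land in the same block exactly when their uniform variables fall in the same interval block, which is the sampling rule described before Theorem 2.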

A stronger version of this theorem is stated in Section 6.3. It is natural to ask whether
Theorem 1 might be reformulated to resemble Theorem 2, and such a reformulation is
indeed possible. Continuing to disregard measure-theoretic details, say that a random
collection $\mathscr{H}$ of subsets of $[0, 1]$ is a random hierarchy on $[0,1]$ if conditions (a) and (b) of
Definition 1.1 hold with $[0,1]$ in place of $S$. We say that $(\mathcal{H}_n)$ is derived as if by uniform
sampling from $\mathscr{H}$ if $(\mathcal{H}_n)$ is equal in distribution to the sequence of hierarchies $(\mathcal{H}'_n)$
defined by

    $\mathcal{H}'_n := \{\{j \in [n] : U_j \in B\} : B \in \mathscr{H}\}$,

where $(U_n)$ is a sequence of IID uniform random variables independent of $\mathscr{H}$.

Theorem 3. If $(\mathcal{H}_n)$ is an exchangeable hierarchy on $\mathbb{N}$, then there is an $(\mathcal{H}_n)$-measurable
random hierarchy $\mathscr{H}$ on $[0, 1]$ for which $(\mathcal{H}_n)$ is derived as if by uniform sampling from
$\mathscr{H}$.
Theorem 3 is proved in Section 5 as a corollary of Theorem 1. The rest of the
paper is organized as follows. Section 3 contains three subsections of definitions,
well-known results, and elementary propositions needed for the proof of Theorem 1. Section
4 contains a proof of Theorem 1. Some complementary discussion and miscellaneous
results may be found in Section 6.



3 Preliminaries 

3.1 Real trees and hierarchies derived from real trees 

Definition 3.1. A segment of a metric space $X$ is the image of an isometry $\alpha : [a, b] \to X$.
The endpoints of the segment are $\alpha(a)$ and $\alpha(b)$. A real tree is a metric space $(T, d)$
for which

(a) for every pair $x, y$ of distinct elements of $T$ there is a unique segment with endpoints
$x$ and $y$, denoted $[[x, y]]$,

(b) if two segments of $T$ intersect in a single point, and this point is an endpoint of
both, then the union of these two segments is again a segment,

(c) if a segment contains distinct points $u, v$ then it contains $[[u, v]]$,

(d) if the intersection of two segments contains at least two distinct points, then this
intersection is a segment.

A real tree is rooted if there is a distinguished element of $T$ called root. Every real tree
we will discuss will be assumed to be rooted, with root denoted $0$. Furthermore, we define
$[[x, x]] = \{x\}$.



In fact, parts (c) and (d) of Definition 3.1 follow from parts (a) and (b). For more
regarding real trees see the excellent course notes [33]. The following example, however,
provides sufficient background on real trees to understand the proof of Theorem 1.

Example 3.1 (Line-breaking and a random real tree, following Aldous). Let $\ell_1$ denote
the Banach space of absolutely summable real sequences, and let $e_i$ denote the $i$th
element of the usual basis, so that $e_1 = (1, 0, 0, \ldots)$, $e_2 = (0, 1, 0, 0, \ldots)$, and so on, and
let $(L_n)$ be a sequence of positive numbers. We define a family of real trees as follows:
first, let $u_1 = (0, 0, \ldots)$ and let

    $T_1 = u_1 + e_1[0, L_1] := \{(0, 0, \ldots) + e_1 x : 0 \le x \le L_1\}$.

Next, select a point $u_2$ from $T_1$ and let

    $T_2 = T_1 \cup (u_2 + e_2[0, L_2]) := T_1 \cup \{u_2 + e_2 x : 0 \le x \le L_2\}$.

We continue recursively: supposing $T_k$ has been defined, we select a point $u_{k+1}$ from $T_k$
and set

    $T_{k+1} = T_k \cup (u_{k+1} + e_{k+1}[0, L_{k+1}])$,

and let $T$ be the closure of the union $\bigcup_{n \ge 1} T_n$. The tree $T_k$ is therefore built up by
"gluing together" $k$ line segments, and if we endow $T$ with the $\ell_1$ metric the geodesic
paths in $T_k$ flow along these line segments as one would expect.

The idea of using the natural basis of $\ell_1$ in order to obtain a countable family of
"orthogonal directions" in which to grow the new branch of $T_k$ is due to Aldous [2].

To get a random real tree, simply randomize the construction above. For example,
let $(L_k)$ be the interarrival times of a Poisson process on $\mathbb{R}_{\ge 0}$ of rate $t\,dt$, and for $k \ge 2$
select $u_k$ according to normalized length measure on $T_{k-1}$. The resulting tree is Aldous's
Brownian continuum random tree.
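
The randomized line-breaking construction can be simulated without embedding the tree in the sequence space. The sketch below is a combinatorial version under stated assumptions: each branch is recorded as a triple (length, index of the branch it is glued to, position of the gluing point along that branch), the name `line_breaking` is hypothetical, and the arrival times of the Poisson process with intensity t dt are generated as sqrt(2 G_k), where G_k are the arrival times of a rate-one Poisson process (a standard time change, since the cumulative intensity is t^2/2).

```python
import math
import random

def line_breaking(k, rng):
    """Return the first k branches of the line-breaking tree as
    (length, parent_branch_index, attach_position) triples."""
    gamma = 0.0       # arrival time of a rate-1 Poisson process
    prev_cut = 0.0    # previous arrival of the process with intensity t dt
    branches = []
    for i in range(k):
        gamma += rng.expovariate(1.0)
        cut = math.sqrt(2.0 * gamma)      # time change for cumulative intensity t^2 / 2
        length = cut - prev_cut           # interarrival time = new branch length
        if i == 0:
            branches.append((length, None, None))
        else:
            # uniform point with respect to length measure on the tree built so far,
            # whose total length equals prev_cut
            x = rng.uniform(0.0, prev_cut)
            for idx, (branch_len, _, _) in enumerate(branches):
                if x <= branch_len:
                    break
                x -= branch_len
            x = min(x, branches[idx][0])  # guard against floating-point overshoot
            branches.append((length, idx, x))
        prev_cut = cut
    return branches

tree = line_breaking(50, random.Random(0))
```

The output records only the shape and lengths of the tree; distances between points can be recovered by following the gluing points back towards the first branch.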

We have defined in (4) the hierarchy derived from $T$ and a sequence $(t_n, n \in \mathbb{N})$ of
points of $T$, but to make the definition precise we need to define the fringe subtree of $T$
rooted at a point $x \in T$, a concept used informally at (5).

Definition 3.2. If $T$ is a real tree and $x$ a point of $T$, then the fringe subtree of $T$ rooted
at $x$ is the set

    $F_x(T) := \{y \in T : x \in [[0, y]]\}$.

Proposition 4. Let $T$ be a real tree. Then for $x, y \in T$, either $F_x(T) \subseteq F_y(T)$, or
$F_y(T) \subseteq F_x(T)$, or $F_x(T) \cap F_y(T) = \emptyset$.

Proof. We claim that for all points $x, y, t \in T$,

(i) if $x \in [[0, y]]$ and $y \in [[0, t]]$ then $x \in [[0, t]]$, and

(ii) if $x \notin [[0, y]]$ and $y \notin [[0, x]]$ then $F_x(T) \cap F_y(T) = \emptyset$.

If $x, y, t$ are distinct non-root elements of $T$ then (i) above follows from two applications
of Part (c) of Definition 3.1. Likewise, if $x, y, t$ are distinct non-root elements of $T$ and
$x \notin [[0, y]]$ and $y \notin [[0, x]]$, and $t \in F_x(T) \cap F_y(T)$, then the segments $[[t, x]] \cup [[x, 0]]$
and $[[t, y]] \cup [[y, 0]]$ are distinct ($y$ is not in the first, $x$ is not in the second), and since
these segments have the same endpoints we arrive at a contradiction with Part (a) of
Definition 3.1, and (ii) follows. If $x, y, t$ are not distinct or if one or more of these is
the root of $T$, one may easily argue by cases. □

Corollary. If $T$ is a real tree and $(t_j, j \ge 1)$ a sequence of points of $T$ then the sequence
$(\mathcal{H}_n)$ defined by

    $\mathcal{H}_n := \{\{j \in [n] : t_j \in F_x(T)\} : x \in T\} \cup \Xi([n])$,

is a hierarchy. Here, $\Xi([n])$ is the trivial hierarchy on $[n]$ defined at (3).

3.2 Random hierarchies: details

In this section we prove the following elementary proposition and show that hierarchies
on $\mathbb{N}$ are in bijective correspondence with certain binary arrays.

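Proposition 5. 1. If $n \ge 1$ and $\mathcal{H}_n$ is a hierarchy on $[n]$, then

    $\mathcal{H}_n = \{(i \wedge j)_n : i, j \in [n]\} \cup \Xi([n])$

where $\Xi([n])$ denotes the trivial hierarchy on $[n]$ and $(i \wedge j)_n$ the MRCA of $i$ and $j$
in $\mathcal{H}_n$.

2. If $(\mathcal{H}_n, n \ge 1)$ is a hierarchy on $\mathbb{N}$ then for every $n$

    $(i \wedge j) \cap [n] = (i \wedge j)_n$,

where $(i \wedge j)$ and $(i \wedge j)_n$ denote the MRCAs of $i$ and $j$ in $(\mathcal{H}_n)$ and $\mathcal{H}_n$, respectively.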

Proof. 1. Note that the subset of $\mathcal{H}_n$ consisting of sets that contain $i$ is totally ordered by
inclusion, by part (a) of Definition 1.1. The smallest member of this class that contains
$j$ is then $(i \wedge j)_n$. This shows that

    $\mathcal{H}_n \supseteq \{(i \wedge j)_n : i, j \in [n]\} \cup \Xi([n])$.

To prove the reverse inclusion, fix $x \in \mathcal{H}_n$ and $i \in x$. The class $\{(i \wedge j)_n : j \in x\}$ is
totally ordered by inclusion, with maximal element $(i \wedge j')_n$, say. Then for all $k \in x$,
$k \in (i \wedge k)_n \subseteq (i \wedge j')_n$, so $x \subseteq (i \wedge j')_n$. On the other hand, $i, j' \in x$ and therefore
$(i \wedge j')_n \subseteq x$. This proves the reverse inclusion.

2. By consistency of the sequence $(\mathcal{H}_n)$, for every $n \ge \max\{i, j\}$,

    $[n] \cap \bigcap_{G \in \mathcal{H}_{n+1} : \{i,j\} \subseteq G} G = \bigcap_{G \in \mathcal{H}_n : \{i,j\} \subseteq G} G$.

It follows that $(i \wedge j)_n \subseteq (i \wedge j)_{n+1}$ for every positive $n$ (recall $(i \wedge j)_n = \emptyset$ if
$\max\{i, j\} > n$). The second assertion follows from this and the fact $(i \wedge j)_n \subseteq [n]$.

□

Proposition 5 shows that if $(\mathcal{H}_n)$ is a hierarchy on $\mathbb{N}$, then the class $\{(i \wedge j) : i, j \in \mathbb{N}\}$
contains complete information about $(\mathcal{H}_n)$, where $(i \wedge j)$ denotes the MRCA of $i$ and $j$
in $(\mathcal{H}_n)$. More explicitly, if the MRCA of $i$ and $j$ in $(\mathcal{H}_n)$ is known, then by restriction
we obtain for every $n$ the MRCA of $i$ and $j$ in $\mathcal{H}_n$, and $\mathcal{H}_n$ consists precisely of such
MRCAs. The collection $\{(i \wedge j) : i, j \in \mathbb{N}\}$ can be conveniently encoded by the following
array,

    $A(i, j, k) := 1(k \in (i \wedge j))$,  $i, j, k \in \mathbb{N}$.    (9)

Such an array has two notable properties:

(a) For all triples $i, j, k \in \mathbb{N}$, $A(i, j, k) = A(j, i, k)$; also $A(i, j, j) = 1$, and furthermore
$A(i, i, k) = 1$ if and only if $i = k$.

(b) For all pairs $i, j$ and $m, n$ of elements of $\mathbb{N}$, either the two sets

    $\{k \in \mathbb{N} : A(i, j, k) = 1\}$ and $\{k \in \mathbb{N} : A(m, n, k) = 1\}$

are disjoint, or they are equal, or one of them contains the other.

Property (a) follows from symmetry of the roles of $i$ and $j$ in (6), and from the fact that
$(i \wedge i) = \{i\}$. Property (b) follows from part (a) of Definition 1.1.

Proposition 6. The correspondence $(\mathcal{H}_n) \mapsto A$ between hierarchies on $\mathbb{N}$ and binary arrays
$A : \mathbb{N}^3 \to \{0, 1\}$ having properties (a) and (b) directly above, is bijective.

The proof of this proposition is elementary and is therefore omitted.

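
A finite analogue of the encoding by a binary array can be checked directly. The sketch below is illustrative: it builds the array of indicators that k belongs to the MRCA of i and j for a hierarchy on [4], verifies properties (a) and (b), and recovers the hierarchy as in part 1 of Proposition 5. The names `mrca` and `array_A`, and the dictionary representation of the array, are assumptions.

```python
from itertools import product

def mrca(H, i, j):
    """Intersection of all members of H that contain both i and j."""
    out = None
    for G in H:
        if i in G and j in G:
            out = G if out is None else out & G
    return out

def array_A(H, n):
    """A[i, j, k] = 1 if k lies in the MRCA of i and j, else 0."""
    idx = range(1, n + 1)
    return {(i, j, k): int(k in mrca(H, i, j)) for i, j, k in product(idx, repeat=3)}

n = 4
idx = range(1, n + 1)
H = {frozenset(), frozenset(idx), frozenset({1, 2, 4})} | {frozenset({s}) for s in idx}
A = array_A(H, n)

# Property (a): symmetry in (i, j), A(i, j, j) = 1, and A(i, i, k) = 1 iff i = k
prop_a = (all(A[i, j, k] == A[j, i, k] for i, j, k in product(idx, repeat=3))
          and all(A[i, j, j] == 1 for i, j in product(idx, repeat=2))
          and all((A[i, i, k] == 1) == (i == k) for i, k in product(idx, repeat=2)))

# Property (b): the sets {k : A(i, j, k) = 1} are pairwise nested or disjoint
rows = {frozenset(k for k in idx if A[i, j, k]) for i, j in product(idx, repeat=2)}
prop_b = all(X & Y in (X, Y, frozenset()) for X in rows for Y in rows)

# Proposition 5, part 1: the MRCAs together with the trivial hierarchy give back H
recovered = rows | {frozenset(), frozenset(idx)} | {frozenset({s}) for s in idx}
```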
3.3 Exchangeable Compositions 

A composition of a set $S$ is a partition of $S$ together with a total order on blocks of this
partition. Starting from such a pair one obtains a binary array $R$ by setting $R(i, j) = 1$ if
either $i$ and $j$ are in the same block of the partition, or the block containing $i$ precedes the
block containing $j$, and otherwise setting $R(i, j) = 0$, for all pairs $i, j \in S$. A binary array
so derived necessarily has the following four properties, which hold for all $i, j, k \in S$.

(i) $R(i, i) = 1$

(ii) if $R(i, j) = 0$ then $R(j, i) = 1$

(iii) if $R(i, j) = 1$ and $R(j, k) = 1$ then $R(i, k) = 1$

(iv) if $R(i, j) = 0$ and $R(j, k) = 0$ then $R(i, k) = 0$

Conversely, starting from such an array $R$ one may define an equivalence relation $\sim$ on
$S$ by

    $i \sim j$ if and only if $R(i, j) = R(j, i) = 1$;

then the equivalence classes of $\sim$ form a partition of $S$, and we may totally order these
classes by declaring that $[i]$ precedes $[j]$ if and only if $R(i, j) = 1$, for all pairs $i, j$ in $S$.
This correspondence between a composition of a set $S$ and a binary array $R$ is obviously
bijective. By (i)-(iv) above, the map

    $R \mapsto \{(i, j) \in S^2 : R(i, j) = 1\}$

sets up a bijective correspondence between such binary arrays $R$ and binary relations on
$S$ that are reflexive, total, transitive, and whose complements are also transitive. Such
relations need not be antisymmetric, and therefore need not be total orders, but every
total order is such a relation. By abuse of notation we may use $i \, R \, j$ and $R(i, j) = 1$
interchangeably.
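
The correspondence between ordered partitions and binary arrays is easy to exercise on a finite set. In the sketch below the names `composition_to_array` and `array_to_composition` are hypothetical, the array is stored as a dictionary keyed by pairs, and blocks are recovered in order by counting, for a representative of each class, how many elements strictly precede it.

```python
def composition_to_array(blocks):
    """blocks: list of disjoint non-empty sets, listed in their total order."""
    rank = {i: r for r, b in enumerate(blocks) for i in b}
    # R(i, j) = 1 iff i and j share a block or the block of i precedes that of j
    return {(i, j): int(rank[i] <= rank[j]) for i in rank for j in rank}

def array_to_composition(R, S):
    """Recover the ordered list of blocks from an array R with properties (i)-(iv)."""
    classes = []
    for i in S:
        cls = frozenset(j for j in S if R[i, j] == 1 and R[j, i] == 1)
        if cls not in classes:
            classes.append(cls)
    # R(i, j) = 0 means j strictly precedes i, so this count increases along the order
    classes.sort(key=lambda c: sum(1 - R[next(iter(c)), j] for j in S))
    return [set(c) for c in classes]

S = [1, 2, 3, 4]
blocks = [{3}, {1, 4}, {2}]
R = composition_to_array(blocks)
round_trip = array_to_composition(R, S)
```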

We will find it more convenient to work with arrays than with totally ordered set
partitions or binary relations, so for our purposes, a composition of a set $S$ will mean a
binary array $R : S \times S \to \{0, 1\}$ for which properties (i)-(iv) above hold. If $S$ is a finite
or countably infinite set then an exchangeable composition on $S$ is a random composition
$R$ for which for every finite subset $S_0$ of $S$ and every permutation $\sigma$ of $S_0$, there is the
distributional equality

    $(R(\sigma(i), \sigma(j)), i, j \in S_0) \overset{d}{=} (R(i, j), i, j \in S_0)$.



Theorem 7 below is a de Finetti-type characterization of exchangeable compositions,
originally given in [38, Theorem 11] and [28, Theorem 5]. Before stating the theorem we
must say a few words about left-uniformization.

By the left-uniformization $F^{*}$ of a distribution $F$, we mean the image of $F$ via the
map $x \mapsto F_l(x)$ from $\mathbb{R}$ to $[0,1]$, where $F_l$ denotes the left-continuous version of the
distribution function of $F$. That is,

    $F^{*}[0, a] = P(F_l(X) \le a)$ for $X$ with distribution specified by

    $P(X \le x) = F(-\infty, x]$, and $F_l(x) = \lim_{w \uparrow x} F(-\infty, w] = F(-\infty, x)$.

It is well-known that if $F$ is a continuous distribution function, then $F^{*}$ is the uniform
distribution on $[0, 1]$. More generally, if the discrete part of $F$ has atoms of magnitude $f_i$
and locations $x_i$, where $f_i > 0$ and $\sum_i f_i \le 1$, then $F^{*}$ is characterized by the following
three properties:

• (i) the distribution $F^{*}$ has an atom of magnitude $f_i$ at $u_i \in [0,1]$, where
$u_i = F(-\infty, x_i)$, for each $i$;

• (ii) the distribution $F^{*}$ places no mass on the interval $I_i := (u_i, u_i + f_i)$, for each $i$;

• (iii) the continuous component of $F^{*}$ is the restriction of Lebesgue measure on $[0, 1]$
to the complement of $\bigcup_i I_i$.

We say that $F$ is left-uniformized if $F^{*} = F$.
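
For a purely discrete distribution with finitely many atoms, property (i) determines the left-uniformization completely: each atom keeps its mass and moves to the total mass lying strictly to its left. The sketch below covers only this special case, which is an assumption of the example, and the name `left_uniformize_discrete` is hypothetical.

```python
def left_uniformize_discrete(atoms):
    """atoms: list of (location, mass) pairs with distinct locations and total mass 1.

    Returns the atoms of the left-uniformized distribution as (u_i, f_i) pairs,
    where u_i is the mass strictly to the left of the i-th location.
    """
    out, below = [], 0.0
    for _, mass in sorted(atoms):
        out.append((below, mass))
        below += mass
    return out

F = [(2.0, 0.5), (-1.0, 0.25), (7.0, 0.25)]
F_star = left_uniformize_discrete(F)          # atoms at 0, 0.25 and 0.75
F_star_again = left_uniformize_discrete(F_star)
```

Applying the map a second time leaves the atoms unchanged, which matches the statement that a left-uniformized distribution is a fixed point of left-uniformization.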

Theorem 7 ([38, Theorem 11] and [28, Theorem 5]). If $R$ is an exchangeable
composition of $\mathbb{N}$ then the limit

    $X_j = \lim_{m \to \infty} \frac{1}{m} \#\{n \in \{1, \ldots, m\} : R(j, n) = 0\}$    (10)

exists almost surely for every $j \in \mathbb{N}$. The family $(X_j, j \in \mathbb{N})$ so defined is exchangeable,
and the directing measure of the family is left-uniformized with probability one.
Furthermore, almost surely for all pairs $j, k$, $R(j, k) = 1$ if and only if $X_j \le X_k$.

Sketch of proof. For every $j \ge 1$, the family $(Y^j_n, n \ge 1)$ defined by

    $Y^j_n := 1(R(j, n') = 0)$,  $n \in \mathbb{N}$,  $n' := n + 1(n \ge j)$,

is exchangeable, and $X_j = \lim_{m \to \infty} m^{-1} \sum_{k=1}^{m} Y^j_k$. The a.s. existence of the limit in (10)
is therefore a consequence of de Finetti's theorem. Part of checking that the family $(X_j)$
has the asserted properties involves showing that if $R(j, i) = 0$ then $X_i < X_j$, and similar
arguments using exchangeable sequences derived from $R$ show that this implication holds
almost surely. The remainder of the argument is straightforward.

□
Figure 2: The $i$th spinal composition associated to a hierarchy is the partition
of leaves of the hierarchy into blocks according to attachment point on the spinal path
from root to leaf $i$, together with the following ordering on these blocks: block $s$ precedes
block $t$ if the attachment point for block $s$ is nearer the root than the attachment point
for block $t$.



3.4 Spinal Compositions 

Definition 3.3. If $(\mathcal{H}_n, n \ge 1)$ is a hierarchy on $\mathbb{N}$ and $i$ an element of $\mathbb{N}$, the $i$th spinal
composition of $\mathbb{N} \setminus \{i\}$ is the binary array defined by

    $R_i(j, k) = A(i, j, k)$  $(j, k \in \mathbb{N} \setminus \{i\})$    (11)

where $A$ is the binary array associated to $(\mathcal{H}_n)$ defined at (9).

The $i$th spinal composition of $\mathbb{N} \setminus \{i\}$ associated to a hierarchy $(\mathcal{H}_n, n \ge 1)$ can be
described less formally as follows in terms of the graph of $\mathcal{H}_n$ defined in Section 1.

For $1 \le i, j, k \le n$, draw the path from root to leaf $i$ in the graph of $\mathcal{H}_n$.
Traverse the vertices of this path starting at root and moving towards $i$, and
keep track of which vertices contain $j$ and which contain $k$. If every vertex
that contains $j$ also contains $k$, then $R_i(j, k) = 1$, otherwise $R_i(j, k) = 0$.

See Figure 2 for a depiction of a spinal composition.
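
On a finite hierarchy the spinal composition can be computed from MRCAs, using the description of the array as the indicator that k lies in the MRCA of i and j. The sketch below is illustrative: the names `mrca` and `spinal_composition` are hypothetical, and the example is the hierarchy on [4] with the single non-trivial block {1, 2, 4}, read along the spine of leaf 1.

```python
def mrca(H, i, j):
    """Intersection of all members of H that contain both i and j."""
    out = None
    for G in H:
        if i in G and j in G:
            out = G if out is None else out & G
    return out

def spinal_composition(H, i, n):
    """R_i(j, k) = 1 if k lies in the MRCA of i and j, for j, k in [n] other than i."""
    others = [j for j in range(1, n + 1) if j != i]
    return {(j, k): int(k in mrca(H, i, j)) for j in others for k in others}

H = {frozenset(), frozenset({1, 2, 3, 4}), frozenset({1, 2, 4})}
H |= {frozenset({s}) for s in range(1, 5)}

R1 = spinal_composition(H, 1, 4)
# Leaf 3 leaves the spine of leaf 1 at the root; leaves 2 and 4 leave it at the vertex {1, 2, 4}.
# Hence the block {3} precedes the block {2, 4}.
```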

It is easily checked that $R_i$ so defined is a composition of $\mathbb{N} \setminus \{i\}$. Furthermore, if $(\mathcal{H}_n)$
is an exchangeable hierarchy on $\mathbb{N}$ then $R_i$ is an exchangeable composition of $\mathbb{N} \setminus \{i\}$. A
version of Theorem 7 then holds, showing the existence of $[0,1]$-valued random variables

    $X^i_j := \lim_{m \to \infty} \frac{1}{m} \#\{n \in [m] \setminus \{i\} : R_i(j, n) = 0\}$,  $j \in \mathbb{N} \setminus \{i\}$.    (12)

The random variables $(X^i_j, j \in \mathbb{N} \setminus \{i\})$ are exchangeable and have a driving measure
that is left-uniformized almost surely, and for $j, k \in \mathbb{N} \setminus \{i\}$, $R_i(j, k) = 1(X^i_j \le X^i_k)$
holds a.s. We call these variables $(X^i_j)$ spinal variables.

Proposition 8. Let $(\mathcal{H}_n)$ be an exchangeable hierarchy on $\mathbb{N}$, let $A$ and $(R_i, i \in \mathbb{N})$
be the binary array associated to $(\mathcal{H}_n)$ as by (9) and the family of spinal compositions
associated to $(\mathcal{H}_n)$, and for every $i \in \mathbb{N}$ let $(X^i_j, j \in \mathbb{N} \setminus \{i\})$ denote the family of spinal
variables associated to $R_i$ by (12). Then for all $i, j, k \in \mathbb{N}$ with $i \notin \{j, k\}$ there is the
almost sure equality of events,

    $\{X^i_j \le X^i_k\} = \{R_i(j, k) = 1\} = \{A(i, j, k) = 1\} = \{(i \wedge k) \subseteq (i \wedge j)\}$,    (13)

where for $i, j \in \mathbb{N}$, $(i \wedge j)$ denotes the MRCA of $i$ and $j$ in $(\mathcal{H}_n)$. Also,

    $X^i_j = \lim_{m \to \infty} \frac{1}{m} \#\{n \in \{1, \ldots, m\} : n \notin (i \wedge j)\}$    (14)



holds with probability one for all distinct $i, j \in \mathbb{N}$. Finally, with probability one, for
distinct $i, j, k$ in $\mathbb{N}$,

(i) $X^i_j = X^j_i$ if $i \ne j$,

(ii) $(i \wedge j) = \{m \in \mathbb{N} : X^i_m \ge X^i_j \text{ or } m = i\}$,

(iii) $X^k_i < X^k_j$ implies $X^i_k = X^i_j$.



Proof. The almost sure equalities in (13) are immediate consequences of definitions; (13) simply collects them in one place for easy reference. For (14) we note that for every triple $i, j, n$ of distinct elements of $\mathbb{N}$, there is the almost sure equality of events

\[ \{R_i(j,n) = 0\} = \{n \notin (i \wedge j)\}, \tag{15} \]

which is immediate from (9) and (11). Now (14) follows from (15) and (12).

Assertion (i) follows from (14) and the fact that $(i \wedge j) = (j \wedge i)$. Assertion (ii) follows from (13).

For assertion (iii), suppose that $X^i_k < X^j_k$; then from (i) and (14) we have $(j \wedge k) \subsetneq (i \wedge k)$. We will show that $(i \wedge k) = (i \wedge j)$; by (14) this is enough for (iii). Already it is plain that $(i \wedge j) \subseteq (i \wedge k)$, so it will suffice to show that $k \in (i \wedge j)$. We proceed by cases: since $(i \wedge k) \cap (i \wedge j) \ne \emptyset$, either $(i \wedge k) \subseteq (i \wedge j)$, in which case we are done, or $(i \wedge j) \subsetneq (i \wedge k)$. So assume that $(i \wedge j) \subsetneq (i \wedge k)$.

• If $(j \wedge k) \subseteq (i \wedge j)$, then $k \in (i \wedge j)$ and we are done.

• If $(i \wedge j) \subsetneq (j \wedge k)$, then $(i \wedge k) \subseteq (j \wedge k)$, which is absurd, since $(j \wedge k) \subsetneq (i \wedge k)$.

Because $(j \wedge k) \cap (i \wedge j) \ne \emptyset$, one of the two bulleted cases above must obtain, and we conclude that $k \in (i \wedge j)$ as desired.

□ 
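The identities of Proposition 8 have an exact finite analogue that can be checked mechanically. The sketch below is an illustration under stated assumptions, not the paper's argument: it replaces the almost sure limits by counting frequencies in a finite hierarchy on $\{1, \dots, n\}$, and the helpers `random_hierarchy`, `mrca`, `x` and `check_proposition_8` are hypothetical names.

```python
import random
from fractions import Fraction
from itertools import permutations


def random_hierarchy(labels, rng):
    """Recursively split the labels into two nonempty parts; return the resulting sets."""
    labels = list(labels)
    members = {frozenset(labels)}
    if len(labels) > 1:
        rng.shuffle(labels)
        cut = rng.randrange(1, len(labels))
        members |= random_hierarchy(labels[:cut], rng)
        members |= random_hierarchy(labels[cut:], rng)
    return members


def mrca(H, i, j):
    """Smallest member of the hierarchy containing both i and j."""
    return min((B for B in H if i in B and j in B), key=len)


def x(H, n, i, j):
    """Finite analogue of the spinal variable: fraction of labels outside (i ^ j)."""
    return Fraction(n - len(mrca(H, i, j)), n)


def check_proposition_8(H, n):
    for i, j, k in permutations(range(1, n + 1), 3):
        assert x(H, n, i, j) == x(H, n, j, i)                            # (i)
        assert (x(H, n, i, j) <= x(H, n, i, k)) == (k in mrca(H, i, j))  # (13)
        if x(H, n, i, k) < x(H, n, j, k):                                # (iii)
            assert x(H, n, i, k) == x(H, n, i, j)
    return True


H = random_hierarchy(range(1, 8), random.Random(1))
print(check_proposition_8(H, 7))
```

Exact rational arithmetic via `Fraction` avoids spurious failures from floating-point ties when comparing frequencies.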



4 Proof of Theorem 1

Let $(\mathcal{H}'_n, n \ge 1)$ be an exchangeable hierarchy on $\mathbb{N}$. For reasons that will soon become clear, it will be much more convenient to work with a hierarchy on $\mathbb{Z}$ rather than on $\mathbb{N}$. Therefore fix an arbitrary bijection $b : \mathbb{N} \to \mathbb{Z}$, and for every $n \ge 1$ set

\[ \mathcal{H}_n := \{\{b(k) : k \in (i \wedge j)\} \cap [\pm n] : i, j \in \mathbb{N}\} \cup \Xi([\pm n]), \tag{16} \]

where $[\pm n] := \{-n, \dots, 0, \dots, n\}$, $\Xi$ is defined as at (3), and $(i \wedge j)$ is the MRCA of $i$ and $j$ in $(\mathcal{H}'_n)$. Then $\mathcal{H}_n$ is a hierarchy on $[\pm n]$ and $\mathcal{H}_{n+1}|_{[\pm n]} = \mathcal{H}_n$ for every $n \ge 1$.



We still need the notion of MRCA in $\mathcal{H}_n$ and in $(\mathcal{H}_n)$. Happily, Definition 2.1 makes sense in the present context with obvious minimal changes, e.g. reading $[\pm n]$ for $[n]$.
We will also need some auxiliary hierarchies, defined as follows. 

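Definition 4.1. For integers $i < 0$, $k < 0$, and $n \ge 1$, with $\Xi$ defined at (3) and $(i \wedge l)$ the MRCA of $i$ and $l$ in $(\mathcal{H}_n)$, we set

\[ \mathcal{H}^i_n := \{(i \wedge l) \cap [n] : l \in \mathbb{Z}\} \cup \Xi([n]), \tag{17} \]

\[ \mathcal{G}^k_n := \{(i \wedge l) \cap [n] : i \in \{-1, \dots, k\},\ l \in \mathbb{Z}\} \cup \Xi([n]) = \bigcup_{k \le i \le -1} \mathcal{H}^i_n, \tag{18} \]

\[ \mathcal{G}_n := \{(i \wedge l) \cap [n] : i < 0,\ l \in \mathbb{Z}\} \cup \Xi([n]) = \bigcup_{i \le -1} \mathcal{H}^i_n. \tag{19} \]

It is easily checked that $\mathcal{H}^i_n$, $\mathcal{G}^k_n$, and $\mathcal{G}_n$ defined above are hierarchies on $[n]$. We now outline the proof of Theorem 1.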

(i) We define for every $k \le -1$ a random tree $T_k$ and a sequence $(t^k_j, j \ge 1)$ of random elements of $T_k$. Both the tree and the samples are contained in $\ell_1$, the Banach space of absolutely summable real sequences. We define another tree $T$ and samples $(t_j, j \ge 1)$ by

\[ T := \mathrm{cl} \bigcup_{k \le -1} T_k, \qquad t_j := \lim_{k \to -\infty} t^k_j, \tag{20} \]

where cl denotes $\ell_1$-closure, and the limits exist almost surely. Both $(t^k_j, j \ge 1)$ for $k \le -1$ and $(t_j, j \ge 1)$ are exchangeable, and for the measure $p$ of Theorem 1 we take the directing measure of the sequence $(t_1, t_2, \dots)$. For $k \le -1$ we let $p_k$ denote the directing measure of $(t^k_j)$. These random measures $(p_k)$ are not used in the proof of Theorem 1, but see Figure 3 for an image of how $p_k$ and $p_{k-1}$ are related.

(ii) We show that $\mathcal{G}_n$ is the hierarchy derived from $T$ and the samples $(t_1, \dots, t_n)$, almost surely for all $n$. To do this, we first take the following intermediate step:

(ii a) We show that $\mathcal{G}^k_n$ is derived from $T_k$ and the samples $(t^k_1, \dots, t^k_n)$, almost surely for all $n \ge 1$, $k \le -1$. After taking this intermediate step, we establish the assertion of (ii) by taking a limit as $k \to -\infty$.

(iii) We show that the hierarchy $(\mathcal{G}_n, n \ge 1)$ is equal in distribution to the hierarchy $(\mathcal{H}'_n, n \ge 1)$ on $\mathbb{N}$ with which we started.

Steps (ii) and (iii), taken together, prove Theorem 1. The proof of Theorem 1 occupies the remainder of this section, and is broken into parts according to the outline above.

Figure 3: At top is shown the graph of $\mathcal{H}_n$ with leaf labels erased. The bold paths are the spinal paths to leaves $-1$ and $-2$, respectively. In the middle, $(T_{-2}, p_{-2})$ is shown. The arrows indicate the $\ell_1$ basis directions, and atoms of $p_{-2}$ are represented by black circles or beads on $T_{-2}$, with circle size corresponding to atom size. At bottom is shown $(T_{-1}, p_{-1})$. Note that $(T_{-2}, p_{-2})$ is derived from $(T_{-1}, p_{-1})$ by "crushing" a bead on $T_{-1}$ into fragments and stringing the crushed bead fragments out in the $e_2$ direction.

4.1 Part (i)

Our main tool for constructing the real tree $T$ and the samples $(t_1, t_2, \dots)$ of Theorem 1 is the collection of $[0,1]$-valued spinal variables associated to spinal compositions, i.e. the family $(X^i_j, i, j \in \mathbb{Z}, i \ne j)$ defined by

\[ X^i_j := \lim_{m \to \infty} \frac{1}{2m+1} \#\{n \in [\pm m] : n \notin (i \wedge j)\} \qquad (i, j \in \mathbb{Z},\ i \ne j), \tag{21} \]

where $(i \wedge j)$ denotes the MRCA of $i$ and $j$ in $(\mathcal{H}_n)$. We adopt the convention that $X^n_n = 1$ for $n \in \mathbb{Z}$. Obviously Proposition 8 remains true in this context with minimal changes. It is worth emphasizing at this point that superscripts $i$ and $k$ on the $X^i_j$'s and $t^k_j$'s, $\mathcal{G}^k_n$'s and $\mathcal{I}^k_n$'s (to be defined later) will be negative, and when taking limits we send $k$ to $-\infty$ rather than $\infty$.

Definition 4.2. Let $(e_j, j \ge 1)$ be the natural basis of $\ell_1$, so that $e_1 = (1, 0, 0, \dots)$, $e_2 = (0, 1, 0, 0, \dots)$, etc., and for $m \ge 1$ let $\pi_m$ denote the orthogonal projection onto $\mathrm{span}\{e_1, \dots, e_m\}$, so that $\pi_m((x_1, x_2, \dots)) = (x_1, \dots, x_m, 0, 0, \dots)$, and $\pi_0((x_1, x_2, \dots)) = (0, 0, \dots)$.

Following Aldous [2], for $x \in \ell_1$ let $[[0,x]]_{sp}$ denote the path that proceeds from $0$ to $x$ along successive directions, for which $[[0,x]]_{sp}$ equals the closure of $[[0,x]]^\circ_{sp}$, where

\[ [[0,x]]^\circ_{sp} := \bigcup_{m \ge 0} \{t \pi_m(x) + (1 - t) \pi_{m+1}(x) : 0 \le t \le 1\}. \tag{22} \]

Observe that $[[0,x]]_{sp}$ differs from $[[0,x]]^\circ_{sp}$ only when $x = (x_1, x_2, \dots)$ does not terminate in zeros, i.e. when $x_j \ne 0$ for infinitely many $j$, and in this case the set difference $[[0,x]]_{sp} \setminus [[0,x]]^\circ_{sp}$ consists of the singleton $\{x\}$.
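For a finitely supported $x$, membership in the path of (22) reduces to a coordinate-by-coordinate test. The following is a minimal sketch under that assumption (vectors are truncated to equal-length tuples, so the closure point plays no role); the function name `on_spinal_path` is hypothetical.

```python
def on_spinal_path(y, x):
    """True if y lies on the path from 0 to x of (22); x and y are equal-length tuples."""
    for m, (ym, xm) in enumerate(zip(y, x)):
        if ym != xm:
            # y must sit on the segment from pi_m(x) to pi_{m+1}(x):
            # its m-th coordinate is a fraction s in [0, 1) of x's m-th coordinate ...
            between = ym * xm >= 0 and abs(ym) <= abs(xm)
            # ... and every later coordinate of y has to vanish.
            return between and all(c == 0 for c in y[m + 1:])
    return True  # y agrees with x in every coordinate


x = (0.5, 0.0, 0.25)
print(on_spinal_path((0.3, 0.0, 0.0), x))   # part of the way along e_1
print(on_spinal_path((0.5, 0.0, 0.1), x))   # past the bend, along e_3
print(on_spinal_path((0.3, 0.0, 0.1), x))   # leaves the e_1 segment too early
```

The path moves along one basis direction at a time, so a point is on it exactly when it copies a prefix of $x$, then stops partway along the next coordinate.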



Definition 4.3. Let $(X^i_j, i, j \in \mathbb{Z}, i \ne j)$ be the spinal variables defined in (21). For all $j \ge 1$, set $t^{-1}_j = e_1 X^{-1}_j$ and for every $k \le -2$ set

\[ t^k_j := e_1 X^{-1}_j + \sum_{k \le i \le -2} e_{|i|} \max\{0,\ X^i_j - \max\{X^{-1}_j, \dots, X^{i+1}_j\}\} \qquad (j \ge 1). \tag{23} \]

Define a family of trees $(T_k, k \le -1)$ using the samples $(t^k_j, j \ge 1, k \le -1)$ as follows:

\[ T_k := \mathrm{cl} \bigcup_{j \ge 1} [[0, t^k_j]]_{sp}, \]

where cl denotes closure in $\ell_1$. For every $k \le -1$ let $d_k$ be the $\ell_1$-metric on $T_k$, and let $T_k$ be rooted at $0 \in \ell_1$.

It is easily checked that for every $k \le -1$, $(t^k_j, j \ge 1)$ is an exchangeable family. Observe that by definition, $\|t^k_j\| = \max\{X^{-1}_j, \dots, X^k_j\} \le 1$.


Definition 4.3 of the samples $(t^k_j)$ can be described as follows: once the samples $(t^k_j)$ and tree $T_k$ have been defined, to define $(t^{k-1}_j)$ we select a subset of samples among those remaining and push these out in the $e_{|k-1|}$-direction, orthogonal to $T_k$ (this subset may possibly be empty). The next proposition shows that every one of these samples is selected from the same spot on $T_k$; that is, $T_{k-1}$ is derived by adding a single branch to $T_k$ (or perhaps not adding a branch at all).

Proposition 9. For every $k \le -1$, the set $\{\pi_{|k|}(t^{k-1}_j) : t^{k-1}_j \ne t^k_j\}$ is either a singleton or the empty set.

Proof. Suppose that $X^{k-1}_j > \max\{X^{-1}_j, \dots, X^k_j\}$. Then for every $i \in \{-1, \dots, k\}$, $X^i_j < X^{k-1}_j$. Thus by part (iii) of Proposition 8, for every $i \in \{-1, \dots, k\}$, $X^i_j = X^i_{k-1}$. We have shown that

\[ t^{k-1}_j \ne t^k_j \quad \text{implies} \quad (X^{-1}_j, \dots, X^k_j) = (X^{-1}_{k-1}, \dots, X^k_{k-1}), \]

and we note that $t^k_j$ is determined by $(X^{-1}_j, \dots, X^k_j)$ to conclude that $\{\pi_{|k|}(t^{k-1}_j) : t^{k-1}_j \ne t^k_j\}$ is a singleton. On the other hand, on the event that $X^{k-1}_j \le \max\{X^{-1}_j, \dots, X^k_j\}$ for every $j \ge 1$, we have $t^{k-1}_j = t^k_j$ for all $j$, and the set in question is empty. □

From the definition of $T_k$ it can be seen that $T_k$ is a real tree with probability one. It follows from Proposition 9 that $T_k$ is furthermore a real tree derived by a line-breaking construction, like the tree in the Example in Section 3.1.

4.1.1 Examples 

The following two examples are not part of the proof of Theorem 1, but together with Figure 3 they may help the reader visualize the construction of the tree $T$.

Example 4.1. Let $(U_n, n \in \mathbb{Z})$ be a family of IID uniform$[0,1]$ random variables, and let

\[ \mathcal{H}_n := \{\{j \in [\pm n] : U_j \ge x\} : 0 \le x \le 1\} \cup \Xi([\pm n]). \]

Following the construction above it can be seen that

\[ T_{-1} := e_1[0, U_{-1}], \]

that $p_{-1}$ is length measure on $e_1[0, U_{-1})$, and that $e_1 U_{-1}$ is an atom of $p_{-1}$ of size $1 - U_{-1}$. Now let $k_1 = -1$ and define a sequence $(k_m, m \ge 1)$ recursively by $k_{m+1} := \max\{i < k_m : U_i > U_{k_m}\}$. Then $(T_{-1}, p_{-1}) = \dots = (T_{k_2+1}, p_{k_2+1})$ and

\[ T_{k_2} := T_{-1} \cup \big(e_1 U_{-1} + e_{|k_2|}[0, U_{k_2} - U_{k_1}]\big), \]

i.e., $T_{k_2}$ is an isometric embedding of $[0, U_{k_2}]$ in $\ell_1$, with a kink or bend at the image of $U_{-1}$ in $\ell_1$. The measure $p_{k_2}$ is the sum of two measures: length measure on $T_{k_2}$, and an atom of size $1 - U_{k_2}$ at the "end" $e_1 U_{k_1} + e_{|k_2|}(U_{k_2} - U_{k_1})$. In general, $T_{k_m}$ is an isometric embedding of $[0, U_{k_m}]$ into $\ell_1$ with $|k_m| - 1$ kinks, and $p_{k_m}$ is length measure on $T_{k_m}$ plus an atom of size $1 - U_{k_m}$ at the end of $T_{k_m}$. The limit tree $T$ is an isometric copy of $[0,1]$, embedded in $\ell_1$, and $p$ is length measure on $T$. The tree $T$ has one leaf (nonroot element whose removal does not disconnect the space), and this leaf has $p$-measure 0.
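Example 4.1 lends itself to a quick simulation of the samples in (23). The sketch below rests on one assumption that is specific to this example: the MRCA of $i$ and $j$ is $\{m : U_m \ge \min(U_i, U_j)\}$, so the spinal variable is $X^i_j = \min(U_i, U_j)$. The names `spinal_variable` and `sample_point` are hypothetical, and the infinite families are truncated to six spinal indices and forty samples.

```python
import random


def spinal_variable(U, i, j):
    # Assumption specific to Example 4.1: X^i_j = min(U_i, U_j).
    return min(U[i], U[j])


def sample_point(U, j, k):
    """Coordinates (along e_1, ..., e_|k|) of t^k_j, following (23)."""
    coords, running_max = [], 0.0
    for i in range(-1, k - 1, -1):
        xi = spinal_variable(U, i, j)
        coords.append(max(0.0, xi - running_max))
        running_max = max(running_max, xi)
    return tuple(coords)


rng = random.Random(0)
U = {i: rng.random() for i in list(range(-6, 0)) + list(range(1, 41))}

for k in range(-1, -6, -1):
    # Samples pushed out in the new direction when passing from T_k to T_{k-1}.
    moved = [j for j in range(1, 41) if sample_point(U, j, k - 1)[-1] > 0]
    branch_points = {sample_point(U, j, k) for j in moved}
    print(k - 1, len(moved), len(branch_points))
```

In each step the moved samples all leave from a single point of $T_k$, as Proposition 9 asserts, and the coordinates of each sample sum to the running maximum of its spinal variables.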

Example 4.2. Let $\mathscr{H}$ denote the following collection of subsets of $[0,1]$,

Let $(U_n, n \in \mathbb{Z})$ be a family of IID uniform $[0,1]$ random variables, and let

\[ \mathcal{H}_n := \{\{j \in [\pm n] : U_j \in B\} : B \in \mathscr{H}\} \cup \Xi([\pm n]). \]

Following the construction above it can be seen that

\[ T_{-1} := e_1[0, 1], \]

and that $p_{-1}$ is purely atomic. The atoms can be described thus: with $f_1 := 0$ and $f_n := \sum_{i=1}^{n-1} 2^{-i}$, the atoms of $p_{-1}$ are at the locations $\{e_1 f_n\}_{n \ge 1}$, and $p_{-1}(\{e_1 f_n\}) = 2^{-n}$.

In fact, for every $k < 0$, the measure $p_k$ is purely atomic. The atoms can be visualized as beads on the strings (segments) that constitute $T_k$. To create the next tree $T_{k-1}$, one of the atoms of $p_k$ is selected with probability proportional to size and crushed into a sequence of smaller atoms, which are then strung out on the new string, respecting left-uniformization except at the location of the crushed atom. More explicitly, suppose that $T_k$ has been defined, and that the selected atom $x$ has $p_k$-mass $2^{-m}$ for some $m$. It will follow from the construction that the distance from $x$ to $0 \in \ell_1$ is $1 - 2^{-m+1}$, and that for some finite increasing sequence $1 \le i_1 < i_2 < \dots < i_{j-1} < i_j = m$, $x$ looks as follows,

\[ x = (f_{i_1}, 0, f_{i_2} - f_{i_1}, 0, 0, 0, f_{i_3} - f_{i_2}, 0, 0, \dots, f_{i_j} - f_{i_{j-1}}, 0, 0, 0, \dots), \]

say; i.e. $x$ is derived by thinning the vector $(f_{i_1}, f_{i_2} - f_{i_1}, f_{i_3} - f_{i_2}, \dots, f_{i_j} - f_{i_{j-1}})$ with zeros. Suppose that $f_{i_j} - f_{i_{j-1}}$ is found in the $l$th coordinate of $x$; this indicates that the branch on which atom $x$ is found was added at the $l$th step of the construction. Now, to create the next tree $T_{k-1}$, set

\[ T_{k-1} = T_k \cup \big(x + e_{|k|+1}[0, 2^{-m+1}]\big), \]

i.e. add a new branch at $x$ in the $e_{|k|+1}$-direction so that the total distance from root to
tip of the new branch is 1. Note that for every point y in the new branch, there will be 
+ 1 — Z zeros between the penultimate and final nonzero entries of the y; this explains 
"zero-thinning" . 

The new measure $p_{k-1}$ equals $p_k$ on $T_k \setminus \{x\}$. The atom $x$ is crushed, and $p_{k-1}(\{x\}) = 0$; crushed bits of $x$ are strung out on the new branch, so that $p_{k-1}$ has atoms at the following locations,

(/ii, 0, fi, - fi-i, 0, 0,0, fi, - /i„ 0, 0, . . . , 4. - 0, . . . , 0, 2" - 2™, 0, 0, . . .) (n > m) 

ie. at a; + e|fe|+i(2"-2™) for every n > m, and + e|fe|+i(2" - 2™)) = 1-2""-!. 

The limit tree $T$ has uncountably many leaves. The measure $p$ is supported on these leaves, and on the set

\[ \{x \in \ell_1 : \pi_j(x) \ne x \text{ for all } j \ge 1\}, \]

because for every $j \ge 1$, all atoms on $T_{-j}$ are eventually selected, crushed, and strung out, off of the set $\{x \in \ell_1 : \pi_j(x) = x\}$. It can also be seen that $p$ is diffuse (i.e. nonatomic), because every atom is eventually crushed into smaller atoms, so no atom of positive mass can remain in the limit.

4.2 Part (ii)a

For $n \ge 1$ and $k \le -1$, let $\mathcal{I}^k_n$ denote the hierarchy derived from $T_k$ and the samples $(t^k_1, \dots, t^k_n)$, that is,

\[ \mathcal{I}^k_n := \{\{j \in [n] : x \in [[0, t^k_j]]_{sp}\} : x \in T_k\} \cup \Xi([n]). \]

Proposition 10. For all integers $n \ge 1$ and $k \le -1$, $\mathcal{G}^k_n = \mathcal{I}^k_n$ almost surely.

Several intermediate results are needed to prove Proposition 10.

Lemma 11. Let H be a hierarchy on a finite set S , and suppose that i € B € H. Then 
almost surely there is j d S such that B = (i A j), where (i A j) denotes the MRCA of i 
and j inli. As a corollary, if every element of % contains i, then 

{(j A : J, ^ e j^l} = {{j As):j^s} a.s. (24) 

Proof. Fix $B \in \mathcal{H}$ with $i \in B$. By the argument for Proposition 5,

\[ B = \bigcup_{j \in B} (i \wedge j). \]

Since the members of $\{(i \wedge j) : j \in B\}$ all have the point $i$ in common, by part (a) of Definition 1.1 they are totally ordered by inclusion. Therefore there is some maximal element $j'$ of $B$ for which $B = (i \wedge j')$. □



Lemma 12. With $\mathcal{H}^i_n$ as in Definition 4.1,



{Bn[n]:ieBenn}=ni, = {{j e[n]:X'j>x}:Q<x<l} 
holds almost surely for all n > I and i < 0. 

Proof. This follows from Lemma 11 Proposition [s] and part (ii) of Proposition |8] □ 
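Lemma 13. For all $k < 0$, and $x \in T_k$,

\[ \{j \ge 1 : t^{k-1}_j \in F_x(T_{k-1})\} = \{j \ge 1 : t^k_j \in F_x(T_k)\} \tag{25} \]

holds almost surely.

Proof. If $t^{k-1}_j$ is in $F_x(T_{k-1})$ for $x \in T_k$ then $x \in [[0, t^{k-1}_j]]_{sp}$, so $x = \pi_{|k|}(x) \in \pi_{|k|}([[0, t^{k-1}_j]]_{sp}) = [[0, t^k_j]]_{sp}$, so $t^k_j \in F_x(T_k)$. On the other hand, $[[0, t^k_j]]_{sp} \subseteq [[0, t^{k-1}_j]]_{sp}$, so $t^k_j \in F_x(T_k)$ implies $t^{k-1}_j \in F_x(T_{k-1})$. □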

Proof of Proposition 10. Since $t^{-1}_j := e_1 X^{-1}_j$ for $j \ge 1$, for $k = -1$ the assertion is covered by Lemma 12. We proceed by induction on $k$ and argue by cases. Throughout, $(i \wedge j)$ denotes the MRCA of $i$ and $j$ in $(\mathcal{H}_n)$ and $(i \wedge j)_n$ denotes the MRCA of $i$ and $j$ in $\mathcal{H}_n$.

The first case is that the following event occurs: $T_k = T_{k-1}$ and $t^{k-1}_j = t^k_j$ for every $j \ge 1$, or otherwise put, there is $i \in \{-1, \dots, k\}$ so that $X^i_j \ge X^{k-1}_j$ for all $j \ge 1$. Noting that $(i \wedge j)_n \cap (k-1 \wedge j)_n \ne \emptyset$ for sufficiently large $n$, from (14) and $X^i_j \ge X^{k-1}_j$ we have $(i \wedge j) \subseteq (k-1 \wedge j)$. Thus $i \in (k-1 \wedge j)$, so

\[ \{(k-1 \wedge j)_n \cap [n] : j \in [n]\} \subseteq \{(i \wedge j)_n \cap [n] : j \in [n]\}, \]

so by Lemma 11, $\mathcal{H}^{k-1}_n \subseteq \mathcal{H}^i_n$, and it follows that $\mathcal{G}^{k-1}_n = \mathcal{G}^k_n$. Since $\mathcal{I}^k_n = \mathcal{I}^{k-1}_n$ in this case, by the induction hypothesis, $\mathcal{G}^{k-1}_n = \mathcal{I}^{k-1}_n$.
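The second case is that the following event occurs: $T_k \subsetneq T_{k-1}$, or otherwise put, for some $j$, $X^{k-1}_j > \max\{X^{-1}_j, \dots, X^k_j\}$. It is enough to show two inclusions,

\[ \mathcal{I}^{k-1}_n \subseteq \mathcal{G}^{k-1}_n \quad \text{and} \quad \mathcal{H}^{k-1}_n \subseteq \mathcal{I}^{k-1}_n, \]

to conclude that $\mathcal{I}^{k-1}_n = \mathcal{G}^{k-1}_n$.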

For the first inclusion, we claim that for every point $x$ in the "new branch" $T_{k-1} \setminus T_k$, the set $\{j \in [n] : t^{k-1}_j \in F_x(T_{k-1})\}$ is also in $\mathcal{G}^{k-1}_n$. Therefore fix such $x$ and assume without loss of generality that the set in question is nonempty. If $x$ happens to equal $t^{k-1}_{j_0}$ for some $j_0 \in [n]$ then set $x' = x$; otherwise proceed along the new branch in the 'outward'/away-from-zero/increasing-norm direction until encountering the first sample $t^{k-1}_{j_0}$ with $j_0 \in [n]$, and set $x' = t^{k-1}_{j_0}$. More precisely, define $x'$ by

\[ x' = \text{an element of } \{t^{k-1}_j : t^{k-1}_j \in F_x(T_{k-1}),\ j \in [n]\} \text{ with minimal } \ell_1\text{-norm}. \]

Then $\{j \in [n] : t^{k-1}_j \in F_x(T_{k-1})\} = \{j \in [n] : t^{k-1}_j \in F_{x'}(T_{k-1})\}$. According to (23), $\{j \in [n] : t^{k-1}_j \in F_{x'}(T_{k-1})\} = \{j \in [n] : X^{k-1}_j \ge X^{k-1}_{j_0}\}$ for every $j_0 \in [n]$ such that $t^{k-1}_{j_0}$ has minimal $\ell_1$-norm among members of $\{t^{k-1}_j : t^{k-1}_j \in F_x(T_{k-1}),\ j \in [n]\}$. According to Proposition 8(ii), $\{j \in [n] : X^{k-1}_j \ge X^{k-1}_{j_0}\} = (k-1 \wedge j_0)_n \cap [n]$, which is an element of $\mathcal{G}^{k-1}_n$. The claim is proved, and in conjunction with the induction hypothesis and Lemma 13 the first inclusion follows.

For the second inclusion, note that if $(k-1 \wedge l)$ contains some $i \in \{-1, \dots, k\}$ then $(k-1 \wedge l) \cap [n]$ appears in $\mathcal{G}^k_n$ and hence in $\mathcal{I}^{k-1}_n$ by Lemma 12 and Lemma 13. On the other hand, if $(k-1 \wedge l)$ is disjoint from $\{-1, \dots, k\}$ then

\[ (k-1 \wedge l) \cap [n] = \{j \in [n] : X^{k-1}_j \ge X^{k-1}_l\} = \{j \in [n] : t^{k-1}_j \in F_x(T_{k-1})\}, \]

where $x$ is the unique point of $T_{k-1} \setminus T_k$ at distance $X^{k-1}_l$ from the root. The second inclusion follows, and we conclude that $\mathcal{I}^{k-1}_n = \mathcal{G}^{k-1}_n$.

The two inclusions taken together show that on the event $T_k \subsetneq T_{k-1}$, we have $\mathcal{I}^{k-1}_n = \mathcal{G}^{k-1}_n$ almost surely. This completes the inductive proof.

□ 

4.3 Part (ii) 

Proposition 14. $T_{-1} \subseteq T_{-2} \subseteq \dots$ almost surely, and the limits

\[ \lim_{k \to -\infty} t^k_j \qquad (j \ge 1) \]

exist almost surely and are members of $T := \mathrm{cl} \bigcup_{k \le -1} T_k$, where cl denotes $\ell_1$-closure.

Proof. By (21) the spinal variables $(X^i_j)$ take values in $[0,1]$ almost surely. Observe that by definition, $\|t^k_j\| = \max\{X^{-1}_j, \dots, X^k_j\} \le 1$, and $\pi_{|k|}(t^{k-1}_j) = t^k_j$. The assertions of the proposition follow from definitions and these two facts. □

Let $\mathcal{I}_n$ be the hierarchy derived from $T$ and the samples $(t_1, \dots, t_n)$, i.e.

\[ \mathcal{I}_n := \{\{j \in [n] : t_j \in F_x(T)\} : x \in T\} \cup \Xi([n]), \]

where $\Xi([n])$ is the trivial hierarchy on $[n]$.

Proposition 15. For every positive integer $n$, $\mathcal{I}_n = \mathcal{G}_n$ almost surely.

We need the following lemma.

Lemma 16. For every pair $u, v$ of positive integers, $(u \wedge v) \cap \{-1, -2, \dots\}$ is nonempty with probability one. Here, $(u \wedge v)$ denotes the MRCA of $u$ and $v$ in $(\mathcal{H}_n)$.

Proof. Define a family $(W_j, j \in \mathbb{Z} \setminus \{u\})$ by

\[ W_j := \begin{cases} 1 & \text{if } \{u, j\} \in \mathcal{H}_n \text{ for all sufficiently large } n, \\ 0 & \text{otherwise.} \end{cases} \]

For distinct integers $j_1, j_2$, the event $\{W_{j_1} = W_{j_2} = 1\}$ is a null set, because $W_{j_1} = W_{j_2} = 1$ implies that $\{j_1, u\}$ and $\{j_2, u\}$ are both members of $\mathcal{H}_n$ for all sufficiently large $n$, contradicting part (a) of Definition 1.1. Therefore there is almost surely at most one 1 in the sequence $(W_j)$. Since $(W_j)$ is easily seen to be exchangeable, by de Finetti's theorem $W_j = 0$ almost surely for all $j$.

It follows that $(A_{\mathcal{H}}(u, v, j), j \in \mathbb{Z} \setminus \{u, v\})$ is a family of Bernoulli variables with at least one 1, almost surely (see (9) for the definition of $A_{\mathcal{H}}$). Since $(A_{\mathcal{H}}(u, v, j))$ is an exchangeable family, the conclusion follows from de Finetti's theorem.

□ 



Proof of Proposition 15. Set $T^\circ := \bigcup_{k \le -1} T_k$ and $\partial T := T \setminus T^\circ$. By Proposition 10 and Lemma 13 it follows that

\[ \mathcal{G}_n = \bigcup_{k \le -1} \mathcal{G}^k_n = \bigcup_{k \le -1} \mathcal{I}^k_n = \{\{j \in [n] : t_j \in F_x(T)\} : x \in T^\circ\} \cup \Xi([n]) \]

holds for every positive integer $n$. It remains to establish that

\[ \{j \in [n] : t_j \in F_x(T)\} \in \mathcal{G}_n \tag{26} \]

holds for every $x \in \partial T$. If the set in (26) is empty or a singleton it is in $\mathcal{G}_n$ by definition; therefore without loss of generality suppose $\{u, v\} \subseteq \{j \in [n] : t_j \in F_x(T)\}$ for some distinct pair $u, v \in [n]$ and $x \in \partial T$. We will derive a contradiction.

We claim first that given these assumptions, $t_u = x = t_v$ almost surely. To see this, note that since $x \in \partial T$, $x = (x_1, x_2, \dots)$ does not terminate in zeros, i.e. $x_i \ne 0$ infinitely often. It follows that $t_u$ does not terminate in zeros, i.e. $t_u \in \partial T$, for otherwise $x$ could not be in $[[0, t_u]]_{sp}$. Now, by definitions it follows that the only point of $[[0, t_u]]_{sp}$ that does not terminate in zeros is $t_u$ itself, so since $x \in [[0, t_u]]_{sp}$ (because $t_u \in F_x(T)$) we must have $x = t_u$, and similarly for $t_v$.

Since $x = t_u = \lim_{k \to -\infty} t^k_u$ is in $\partial T$, it follows that there is a subsequence $(k_m)$ of $\{-1, -2, \dots\}$ for which

\[ \|t^{k_1}_u\| < \|t^{k_2}_u\| < \cdots. \tag{27} \]

Let $(k_m, m \ge 1)$ be the subsequence of $\{-1, -2, \dots\}$ consisting of the times at which $(X^k_u, k \le -1)$ exceeds its past maximum,

\[ k_1 = -1, \qquad k_{m+1} := \max\{i < k_m : X^i_u > X^{k_m}_u\}. \]

Since $\|t^k_u\| = \max\{X^{-1}_u, \dots, X^k_u\}$, this sequence $(k_m, m \ge 1)$ is well-defined. Now by (14) it follows that

\[ (k_1 \wedge u) \supsetneq (k_2 \wedge u) \supsetneq \cdots \]

is a strictly decreasing nested family of sets. We claim that $v \in \bigcap_{m \ge 1} (k_m \wedge u)$. This is apparent from the proof of Proposition 10, where it is shown that

\[ (k_m \wedge u) \cap [n] = \{j \in [n] : t^{k_m}_j \in F_{t^{k_m}_u}(T_{k_m})\}, \]

since $t^{k_m}_v = \pi_{|k_m|}(t_v) = \pi_{|k_m|}(t_u) = t^{k_m}_u$.

Since $u$ and $v$ are both contained in $\bigcap_{m \ge 1} (k_m \wedge u)$, it follows that $(u \wedge v) \subseteq \bigcap_{m \ge 1} (k_m \wedge u)$. From Lemma 16 there is then with probability one a negative number $i$ (we can take $i$ to be the maximum such number) for which $i \in \bigcap_{m \ge 1} (k_m \wedge u)$. It follows that $(i \wedge u) \subseteq (k_m \wedge u)$ for all $m$, so that $X^i_u \ge X^{k_m}_u$ for every $m$ by (21), contradicting the definition of $(k_m)$.

We have obtained the desired contradiction. It follows that for every fixed $x \in \partial T$, the set $\{j \in [n] : t_j \in F_x(T)\}$, if nonempty, is with probability one a singleton and therefore an element of $\mathcal{G}_n$. Now observe that

\[ \{\{j \in [n] : t_j \in F_x(T)\} : x \in T\} \cup \Xi([n]) = \{\{j \in [n] : t_j \in F_{t_l}(T)\} : l \in \mathbb{N}\} \cup \Xi([n]) \quad \text{a.s.} \]

Proposition 15 follows. □



Remark. The proof of Proposition 15 shows that the restriction of $p$ to the set $T \setminus \bigcup_{k \le -1} T_k$ is diffuse, i.e. nonatomic.

4.4 Part (iii) 

Proposition 17. The hierarchies $(\mathcal{G}_n, n \ge 1)$ and $(\mathcal{H}'_n, n \ge 1)$ are equal in distribution.

It should perhaps be pointed out again that $(\mathcal{H}'_n, n \ge 1)$ is the hierarchy on $\mathbb{N}$ with which we started, i.e. with which we defined the hierarchy $(\mathcal{H}_n)$ on $\mathbb{Z}$. We will need the following lemma:

Lemma 18. The following equality holds almost surely, 

\[ \{(i \wedge j) \cap [\pm n] : j \in \mathbb{Z},\ i < 0\} = \{(l \wedge j) \cap [\pm n] : l, j \in \mathbb{Z}\}, \]

where $(i \wedge j)$ and $(l \wedge j)$ denote MRCAs in $(\mathcal{H}_n)$.

Proof. We need only prove the "$\supseteq$" direction of the equality. Fix $l, j$ in $\mathbb{Z}$, and let $\mu_j$ be the directing measure of the exchangeable sequence $(X^j_n, n \in \mathbb{Z} \setminus \{j\})$. There are three cases to consider.

• $X^j_l$ is an atom of the directing measure $\mu_j$. Then there is almost surely some negative integer $i$ for which $X^j_l = X^j_i$. Then by Proposition 8 part (ii),

\[ (i \wedge j) = \{k \in \mathbb{Z} : X^j_k \ge X^j_i\} = \{k \in \mathbb{Z} : X^j_k \ge X^j_l\} = (j \wedge l), \]

and the claim follows.

• $X^j_l$ is not an atom of $\mu_j$. Recalling the discussion of left-uniformization preceding Theorem 7, it can be seen that with probability 1 there is some negative integer $i$ for which $\max\{X^j_k : k \in [\pm n],\ k \notin (j \wedge l)\} < X^j_i < X^j_l$. Then

\[ (i \wedge j) \cap [\pm n] = \{m \in [\pm n] : X^j_m \ge X^j_i\} = \{m \in [\pm n] : X^j_m \ge X^j_l\} = (l \wedge j) \cap [\pm n]. \]

The third case, on which we need not linger, is that a probability zero event occurs, e.g. $X^j_l$ lies outside the support of $\mu_j$. □

Lemma 19. There is the following equality in distribution for all $n$,

\[ \mathcal{H}'_n \overset{d}{=} \{(l \wedge j) \cap [n] : l, j \in [n]\} \cup \Xi([n]), \]

where $(l \wedge j)$ is the MRCA of $l$ and $j$ in $(\mathcal{H}_n)$.



Proof. Let us say that (161 defines as image of (H^) under b, and write (Hn) = 

b{{Hn)) to express this succinctly. Let c be a bijection from Z to Z, and let (Tin) = 
c{{'Hn)) be the image of {Hn) under c. By exchangeability of ("HJJ, there is the following 
equality in distribution, 

:= c((H„)) - i-Hn) ■■= biin'J) 
which holds for all fixed bijections c : Z H> Z and 6 : N i— > Z. It follows that 



Now choose c so that c{b{j)) — j for j ~ 1, . . . ,n. It is straightforward to check that 



■Hr 



H'n almost surely for this choice of c. Note finally that 



nr 



{{lAj)n[n] -.Ije [n]}US([n]) 



by Proposition [5| This establishes the claim. □ 
Proof of Proposition 17. From Lemma 18, Proposition 5 and Lemma 19 we have

\[ \begin{aligned} \mathcal{G}_n &= \{(i \wedge j) \cap [n] : i < 0,\ j \in \mathbb{Z}\} \cup \Xi([n]) \\ &= \{(i \wedge j) \cap [n] : i, j \in \mathbb{Z}\} \cup \Xi([n]) \\ &= \{(i \wedge j) \cap [n] : i, j \in [n]\} \cup \Xi([n]), \end{aligned} \]

where $(i \wedge j)$ denotes the MRCA of $i$ and $j$ in $(\mathcal{H}_n)$. □

5 Proof of Theorem 3



Proof. Suppose without loss of generality that $(\mathcal{H}_n)$ is the hierarchy derived from a real tree $T$ and an exchangeable family $(t_j, j \ge 1)$ of random elements of $T$ having directing measure $p$. We may further suppose that $T$ is embedded in $\ell_1$ by a stick-breaking procedure as in the proof of Theorem 1 or as in the Example of Section 3.1. For $k \in \mathbb{N}$ let $\pi_k$ be the orthogonal projection onto the span of the first $k$ standard basis elements of $\ell_1$,

\[ \pi_k((x_1, x_2, \dots)) = (x_1, \dots, x_k, 0, 0, \dots). \]

We will define a map $\xi : [0,1] \to T$ such that for every $k \ge 1$, and every point $x \in T_k := \{\pi_k(x) : x \in T\}$,

(a) $\xi^{-1}(F_x(T))$ is an interval,

(b) the Lebesgue measure of $\xi^{-1}(F_x(T))$ equals $p(F_x(T))$.

To that end, for $k \ge 1$ let $p_k$ be the image of $p$ under $\pi_k$. The branches of $T_k$ can be visualized as strings, and atoms of $p_k$ can be visualized as beads on these strings. With this imagery, $(T_{k+1}, p_{k+1})$ is derived as if by selecting a bead of $p_k$, crushing this bead into a series of smaller beads, and then drawing these smaller beads out onto the new string $T_{k+1} \setminus T_k$, possibly leaving some mass at the location of the crushed atom. (There is also the possibility that $(T_{k+1}, p_{k+1}) = (T_k, p_k)$, but this may be ignored.) Let $\xi_1 : [0,1] \to T_1$ be

• an increasing map, meaning that $x < y$ implies $\|\xi_1(x)\| \le \|\xi_1(y)\|$,

• such that $p_1$ is the image of Lebesgue measure under $\xi_1$.

Now, $T_2$ is derived as if by selecting an atom of $p_1$, crushing it, and stringing the crushed bits in the $e_2$ direction. If $a$ is the selected atom of $p_1$, then $\xi_1^{-1}(\{a\})$ is an interval $(u, v]$ or $[u, v]$ in $[0,1]$. We may therefore define a modification $\xi_2$ of $\xi_1$ in such a way that

• $\xi_2$ agrees with $\xi_1$ off of $(u, v]$ (or $[u, v]$ as the case may be),

• $\xi_2$ sends $(u, v)$ onto $\{a\} \cup (T_2 \setminus T_1)$,

• $u < s < t < v$ implies $\|\xi_2(s)\| \le \|\xi_2(t)\|$,

• $p_2$ is the image of Lebesgue measure under $\xi_2$.

With this type of construction we can establish the existence of a family of maps $(\xi_k, k \ge 1)$, such that

1. for all $x \in [0,1]$, $\pi_k(\xi_{k+1}(x)) = \xi_k(x)$,

2. if $\xi_k(x) = \xi_k(y)$ and $x < y$, then $\|\xi_{k+1}(x)\| \le \|\xi_{k+1}(y)\|$,

3. $p_k$ is the image of Lebesgue measure under $\xi_k$.

It is straightforward to show that the limit $\xi := \lim_{k \to \infty} \xi_k$ exists Lebesgue a.e. and has the asserted properties (a) and (b). Now set

\[ \mathscr{H} := \Big\{\xi^{-1}(F_x(T)) : x \in \bigcup_k T_k\Big\} \cup \Xi([0,1]). \]

For $(U_j)$ an IID sequence of uniform $[0,1]$ random variables independent of $T$ and $n \ge 1$ let

K — {{j e[n]:UjeB}:Be.^ 

and let 

K ■■= {{.] e N : m,) e F,{T)} : X e r} u 

It is easily seen that — H'^ almost surely. Also, conditionally given {T,p), the 
sequence $(\xi(U_1), \dots, \xi(U_n))$ is an IID sequence of points with common distribution $p$. An argument such as can be found in the proof of Proposition 15 shows that if $x \in T \setminus \bigcup_k T_k$ then $\{j \in [n] : \xi(U_j) \in F_x(T)\}$ is with probability one either empty or a singleton. It follows that



(K) ' inn). 



□ 



6 Complements 
6.1 Properties of p and (Hn) 

Let $\mathscr{H}$ denote the following class of subsets of the closed interval $[0,3]$,

\[ \mathscr{H} := \{(0,1), (1,2), (2,3)\} \cup \bigcup_{n \ge 1} \Big\{\Big(\tfrac{j}{2^n}, \tfrac{j+1}{2^n}\Big) : 0 \le j \le 2^n - 1\Big\} \cup \{(2, x) : 2 < x < 3\} \cup \Xi([0,3]). \]

Let $(U_n, n \ge 1)$ be an IID sequence of Uniform$[0,3]$ random variables, and define an exchangeable hierarchy on $\mathbb{N}$ by

\[ \mathcal{H}_n := \{\{j \in [n] : U_j \in B\} : B \in \mathscr{H}\}. \tag{28} \]

Figure 4 shows the graph $T_n$ of $\mathcal{H}_n$ for large $n$, omitting leaf labels. Let us describe a few key features of this graph $T_n$ and relate them to $\mathscr{H}$.

• The root of $T_n$ has degree three. The three vertices $v_1, v_2, v_3$ connected to the root correspond to the three subintervals $(0,1)$, $(1,2)$, $(2,3)$ of $[0,3]$ contained in $\mathscr{H}$.

• The graph of $T_n$ exhibits recursive binary splitting below the vertex $v_1$; this is a consequence of the recursive binary splitting of $(0,1)$ in $\mathscr{H}$.

• The graph of $T_n$ looks star-like or broomstick-like below $v_2$; this is because $\mathscr{H}$ contains no nonsingleton subsets of $(1,2)$.

• The graph of $T_n$ looks like a comb or a caterpillar below $v_3$; this is because $\mathscr{H}$ contains a family of subsets of $(2,3)$ of the form $(2, x)$ for $x$ in a dense subset of $(2,3)$.

Figure 4: Graph of the hierarchy defined by (28) with leaf labels omitted.
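The hierarchy of (28) can be generated numerically for a finite sample. The sketch below is an approximation under stated assumptions: dyadic intervals are truncated at a finite depth, and the comb family $(2, x)$ is represented by taking $x$ at the sampled values only, which produces every distinct trace on a finite sample. The function names `induced_hierarchy` and `is_hierarchy` are hypothetical.

```python
import random
from itertools import combinations


def induced_hierarchy(U, depth=12):
    """Traces on the labels 1..n of the intervals in the class used in (28)."""
    labels = range(1, len(U) + 1)

    def trace(a, b):
        return frozenset(j for j in labels if a < U[j - 1] < b)

    # Trivial hierarchy: empty set, whole label set, singletons.
    members = {frozenset(), frozenset(labels)} | {frozenset([j]) for j in labels}
    members |= {trace(0, 1), trace(1, 2), trace(2, 3)}
    # Recursive binary splitting of (0, 1), truncated at the given depth.
    for d in range(1, depth + 1):
        for i in range(2 ** d):
            members.add(trace(i / 2 ** d, (i + 1) / 2 ** d))
    # Comb below (2, 3): intervals (2, x) with x at the sampled values.
    for u in U:
        if 2 < u < 3:
            members.add(trace(2, u))
    return members


def is_hierarchy(members):
    """Check that any two members are nested or disjoint."""
    return all(A & B in (A, B, frozenset()) for A, B in combinations(members, 2))


rng = random.Random(3)
U = [rng.uniform(0, 3) for _ in range(25)]
H = induced_hierarchy(U)
print(len(H), is_hierarchy(H))
```

Storing each member as a `frozenset` of labels makes duplicate traces collapse automatically, so the size of the result reflects the number of distinct vertices in the graph.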

From this example, one might make the following naive conjecture. 

Naive Conjecture: The three phenomena exhibited by $(\mathcal{H}_n)$ and its graph (infinite recursive splitting, finite splitting, and comblike erosion) are the basic building blocks out of which every exchangeable hierarchy is made.

However, we have difficulty seeing how to make this conjecture more precise: comblike 
erosion can be interspersed with recursive splitting, splits need not be binary, and a 
countable family of splits can precede another countable family of splits, and it may be 
that this latter family of splits is not well-ordered by containment. It is easy to imagine 
pathological examples of hierarchies. 

In lieu of a precise form of the conjecture, we offer the following propositions, which in conjunction with Theorem 1 represent an effort at proving something like the Naive Conjecture. But first, supposing that $(\mathcal{H}_n)$ is a hierarchy on $\mathbb{N}$, we set

\[ \alpha_n(j) := \bigcap_{G \in \mathcal{H}_n :\ j \in G,\ G \ne \{j\}} G \]

and call $\alpha_n(j)$ the parent of $j$ in $\mathcal{H}_n$; it is easily checked that $\alpha_{n+j}(j) \in \mathcal{H}_{n+j}$ for all $n, j \ge 1$.

Also, for ("Hn) a hierarchy on N and i, j e N we write i ^ j if either i — j ot for all 
n > max{i, j}, 

• S A j)n = "nO'), and 

• if w is in {i} U (i A j)„ \ and v € [n] then 

ctn{u) — ctniv) implies u = v. 

Less formally, we write i j for distinct i,j G N if for every n > max{i,j}, the graph 
of Hn looks like a comb in the neighborhood of i and j, and i is "lower down" in this 






comb than is $j$. Next, let $\Pi$ be the partition of $\mathbb{N}$ derived by putting $i$ and $j$ in the same block if and only if either $i \preceq j$ or $j \preceq i$. We say that $\Pi$ is the comb-partition of $(\mathcal{H}_n)$ and the blocks of $\Pi$ are the comb components of $(\mathcal{H}_n)$. It is easily checked that if $(\mathcal{H}_n)$ is an exchangeable random hierarchy on $\mathbb{N}$ then the comb-partition is exchangeable.

Proposition 20. Suppose that $(T, p)$ is a random weighted real tree, that $(t_j)$ is an exchangeable sequence directed by $p$, and that $(\mathcal{H}_n)$ is the exchangeable hierarchy on $\mathbb{N}$ derived from $T$ and $(t_j)$. On the event that

• there is a segment $[[u,v]]$ of $T$ that is oriented towards the root of $T$, with $v$ further from the root, meaning that $[[u,v]] \subseteq [[0,v]]$,

• such that $[[u,v]]$ does not sprout any branches of positive $p$-mass, meaning that for all $x$ in the support of $p$ with $x \notin [[u,v]]$,

$u \in [[0,x]]$ implies $v \in [[0,x]]$,

• such that $p_a([[u,v]]) = 0$ and $p_d([[u,v]]^\circ) > 0$, where $p_a$ and $p_d$ are the atomic and diffuse components of $p$, respectively, and $[[u,v]]^\circ$ denotes the interior of $[[u,v]]$ for the topology of $T$,

the set $\{j : t_j \in [[u,v]]^\circ\}$ is a subset of one of the comb-components of $(\mathcal{H}_n)$. Conversely, on the event that distinct positive integers $i$ and $j$ lie in the same comb-component of $(\mathcal{H}_n)$, there is with probability one a segment $[[u,v]]$ of $T$ having the properties above for which $t_i, t_j \in [[u,v]]^\circ$.

The statement of the proposition is obvious from definitions.

Proposition 21. Suppose that $(T, p)$ is a random weighted real tree, that $(t_j)$ is an exchangeable sequence directed by $p$, and that $(\mathcal{H}_n)$ is the exchangeable hierarchy on $\mathbb{N}$ derived from $T$ and $(t_j)$. On the event that $a$ is an atom of $p$, for all distinct pairs $u, v$ in the set $B = \{j : t_j = a\}$, $(u \wedge v) = B$, where $(u \wedge v)$ denotes the MRCA of $u$ and $v$ in the hierarchy derived from $(t_j)$ and $T$. Furthermore, $t_j$ is an atom of $p$ if and only if

\[ 0 < \lim_{n \to \infty} \frac{1}{n} \#\{k \in [n] : \alpha_m(j) = \alpha_m(k) \text{ for all } m \ge \max\{j, k\}\}, \]

and if $t_j$ is an atom of $p$ then $p(\{t_j\})$ equals this limit almost surely.

The proof of this proposition is elementary and is therefore omitted.

Remark. The atomic and diffuse parts $p_a$ and $p_d$ of the random measure $p$ of Theorem 1 are "invariants," loosely speaking, of the exchangeable hierarchy $(\mathcal{H}_n)$ of that theorem. More formally, $p_a$ and $p_d$ are measurable functions of $p$, within the standard abstract setup for random measures [53, Chapter 1].

6.2 EPPFs, EHPFs, etc. 

Throughout this section, let C := ljfe>i^'' denote the set of compositions of positive 
integers. 
Suppose that $\Pi$ is an exchangeable random partition of $\mathbb{N}$ and for $n \ge 1$ let $\Pi_n = \Pi|_{[n]}$
denote the restriction of $\Pi$ to $[n]$. Then it is straightforward to show that there is
a symmetric function $p : \mathcal{C} \to [0, 1]$ such that for any partition $\pi_n = \{B_1, \ldots, B_k\}$ of $[n]$
into disjoint blocks $B_1, \ldots, B_k$,

$$\mathbb{P}(\Pi_n = \pi_n) = p(\#B_1, \ldots, \#B_k) \qquad (29)$$

where $(\#B_1, \ldots, \#B_k)$ are the sizes of the blocks of $\pi_n$. Both the number of blocks and
the sequence of block sizes can be regarded as functions of $\pi_n$. Blocks of a partition are
conventionally ordered by least elements, or alternatively by size, but this is immaterial
for the present discussion because symmetry of $p$ means that the order in which block
sizes are presented does not matter. More formally, symmetry of $p$ means that for every
$k \ge 1$,

$$p(\lambda_1, \ldots, \lambda_k) = p(\lambda_{\sigma(1)}, \ldots, \lambda_{\sigma(k)}) \qquad (30)$$

for every element $(\lambda_1, \ldots, \lambda_k)$ of $\mathcal{C}$ and every permutation $\sigma$ of $[k]$. Another property of
$p$ comes from the consistency of $\Pi_n$ as $n$ varies: by exchangeability, the distribution of
$\Pi_n$ is the same as the distribution of the partition derived from $\Pi_{n+1}$ by first relabeling
the contents of the blocks of $\Pi_{n+1}$ using a uniform random permutation of $[n + 1]$ and
then restricting the resulting partition to $[n]$. In purely algebraic terms this translates
as the following addition rule:

$$p(\lambda_1, \ldots, \lambda_k) = p(\lambda_1 + 1, \ldots, \lambda_k) + \ldots + p(\lambda_1, \ldots, \lambda_k + 1) + p(\lambda_1, \ldots, \lambda_k, 1). \qquad (31)$$

There is also the trivial normalization condition

$$p(1) = 1. \qquad (32)$$
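As a concrete check of conditions (30)-(32), the following minimal sketch tests them on one familiar candidate, the Ewens (Chinese restaurant) formula $p(\lambda_1, \ldots, \lambda_k) = \theta^k \prod_i (\lambda_i - 1)! / (\theta(\theta+1)\cdots(\theta+n-1))$ with $n = \sum_i \lambda_i$. This particular $p$ is an assumption brought in for illustration and is not discussed in the text; the function names `ewens_eppf` and `addition_rule_rhs` are likewise hypothetical.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial

def ewens_eppf(block_sizes, theta=Fraction(1)):
    """Ewens formula (illustrative assumption): theta^k * prod (size-1)! / rising factorial."""
    n, k = sum(block_sizes), len(block_sizes)
    numerator = Fraction(theta) ** k
    for size in block_sizes:
        numerator *= factorial(size - 1)
    denominator = Fraction(1)
    for i in range(n):
        denominator *= theta + i
    return numerator / denominator

def addition_rule_rhs(p, block_sizes):
    """Right-hand side of (31): grow one block by one element, or open a new singleton block."""
    total = Fraction(0)
    for j in range(len(block_sizes)):
        grown = list(block_sizes)
        grown[j] += 1
        total += p(tuple(grown))
    return total + p(tuple(block_sizes) + (1,))

lam = (3, 1, 2)
# Normalization (32).
assert ewens_eppf((1,)) == 1
# Symmetry (30): the value does not depend on the order of the block sizes.
assert all(ewens_eppf(s) == ewens_eppf(lam) for s in permutations(lam))
# Addition rule (31), checked in exact rational arithmetic.
assert addition_rule_rhs(ewens_eppf, lam) == ewens_eppf(lam)
```

Exact fractions are used so that the addition rule is verified as an identity rather than up to rounding.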

We say that $p : \mathcal{C} \to [0, 1]$ is an exchangeable partition probability function (EPPF) if $p$
satisfies (30)-(32); an easy construction shows that if $p$ is an EPPF then there is an
exchangeable random partition $\Pi$ of $\mathbb{N}$ for which (29) holds for all $n \ge 1$. EPPFs are
therefore in 1-1 correspondence with distributions of exchangeable partitions. For more
on EPPFs see [63] and [62, Chapters 2 & 3].

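According to a result in [63], if $p$ is an EPPF then there is a sequence $(P_1, P_2, \ldots)$ of
nonnegative random variables for which

$$p(\lambda_1, \ldots, \lambda_k) = \mathbb{E}\left[ \left( \prod_{i=1}^{k} P_i^{\lambda_i - 1} \right) \left( \prod_{i=2}^{k} \Big( 1 - \sum_{j=1}^{i-1} P_j \Big) \right) \right] \qquad (33)$$

holds for all $\lambda \in \mathcal{C}$. The EPPF $p$ determines the joint law of this sequence $(P_i)$ uniquely,
and conversely. There is also the following consequence of Theorem 2: if $\Pi$ is an
exchangeable random partition of $\mathbb{N}$ for which the sequence $(P_i)$ of (33) satisfies

$$\mathbb{P}\big( M := \inf\{n : P_n = 0\} > 1 \text{ and if } M < \infty \text{ then } P_{M+i} = 0 \text{ for all } i > 0 \big) = 1$$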

or equivalently, if $\Pi$ is an exchangeable random partition of $\mathbb{N}$ that almost surely does
not contain any singleton blocks, then there is a measure $\mu$ on the Kingman simplex
$\nabla := \{(x_1, x_2, \ldots) : x_1 \ge x_2 \ge \ldots \ge 0, \sum_{i \ge 1} x_i \le 1\}$ for which, for all $\lambda \in \bigcup_{k \ge 1} \mathbb{N}^k$,

$$p(\lambda) = \int_{\nabla} m_\lambda(x) \, \mu(dx), \qquad (34)$$

where $m_\lambda$ is the monomial symmetric polynomial

$$m_\lambda(x) := \sum_{\sigma} \prod_{i=1}^{k} x_{\sigma(i)}^{\lambda_i} \qquad (35)$$

for $\lambda = (\lambda_1, \ldots, \lambda_k)$, where the sum is taken over all injective functions $\sigma : [k] \to \mathbb{N}$. The
relation between the $(P_i)$ of (33) and the measure $\mu$ of (34) can be succinctly described:
$\mu$ is the distribution of the rearrangement of $(P_i)$ in nonincreasing order. It can therefore
be seen that this $\mu$ is supported on the set

$$\Big\{ x \in \nabla : \sum_{i \ge 1} x_i = 1 \Big\}. \qquad (36)$$

For proofs of the material in this exposition, see [63] and [62, Chapters 2 & 3].
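The integral representation (34) can be made concrete in the simplest case, where $\mu$ is a point mass at a single finitely supported $x$ with $\sum_i x_i = 1$. The sketch below, under that assumption, computes the sum over injective maps by brute force and checks the normalization (32) and the addition rule (31). The vector `x` and the helper name `monomial_symmetric` are illustrative choices, not taken from the text.

```python
from fractions import Fraction
from itertools import permutations

def monomial_symmetric(block_sizes, x):
    """Sum over injective sigma: [k] -> indices of x of prod_i x[sigma(i)] ** block_sizes[i]."""
    total = Fraction(0)
    # permutations(range(len(x)), k) enumerates exactly the injective maps [k] -> indices.
    for sigma in permutations(range(len(x)), len(block_sizes)):
        term = Fraction(1)
        for size, index in zip(block_sizes, sigma):
            term *= x[index] ** size
        total += term
    return total

# Hypothetical point of the Kingman simplex with total mass one.
x = (Fraction(1, 2), Fraction(1, 3), Fraction(1, 6))

def p(block_sizes):
    """EPPF given by (34) when mu is the point mass at x."""
    return monomial_symmetric(block_sizes, x)

assert p((1,)) == 1                                   # normalization (32)
lam = (2, 1)
extensions = [(3, 1), (2, 2), (2, 1, 1)]              # the terms on the right of (31)
assert sum(p(e) for e in extensions) == p(lam)        # addition rule (31)
```

With more blocks than atoms of `x` the sum over injective maps is empty, so `p` returns zero, as expected for a partition that cannot arise from three intervals.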

It is natural to ask whether there is an analogous story for exchangeable hierarchies
on $\mathbb{N}$. We are unable to find a "hierarchies counterpart" to (33), i.e. a moment formula
which relates every exchangeable hierarchy to the moments of a family of random
variables. But the remaining formulae above all have analogues in the exchangeable 
hierarchies setting. Making these analogies explicit involves the use of a new (to us) 
family of symmetric polynomials. 

Recall from the introduction that the graph of a hierarchy $\mathcal{H}_n$ on $[n]$ is a rooted
tree $T_n$ with $n$ leaves, where each leaf bears a distinct label in $[n]$. (The graph of a
hierarchy also lacks non-root internal vertices of degree two, and furthermore lacks edge
lengths and orientation of edges at vertices.) We define the shape of such a tree
$T_n$ to be its orbit under the action of the symmetric group, the action of a permutation
$\sigma$ being to relabel leaf $i$ by $\sigma(i)$ for every $i \in [n]$. We use lower-case bold face $\mathbf{s}_n$ to
denote the shape of $T_n$, which can be regarded as a function of $T_n$ or alternatively of $\mathcal{H}_n$,
$\mathbf{s}_n = \mathbf{s}(T_n) = \mathbf{s}(\mathcal{H}_n)$. Obviously, the shape of $T_n$ can be identified with the unlabeled
tree derived by erasing the labels on the leaves of $T_n$, but this observation is not far from
a tautology, as unlabeled graphs are often defined as orbits under such actions of the
symmetric group.

We write $\mathbf{s}_n \nearrow \mathbf{s}_{n+1}$ if there is a hierarchy $\mathcal{H}_{n+1}$ on $[n+1]$ for which $\mathbf{s}_n = \mathbf{s}(\mathcal{H}_{n+1}|_{[n]})$
and $\mathbf{s}_{n+1} = \mathbf{s}(\mathcal{H}_{n+1})$. Equivalently (regarding the shapes $\mathbf{s}_n$ and $\mathbf{s}_{n+1}$ as unlabeled
graphs), we write $\mathbf{s}_n \nearrow \mathbf{s}_{n+1}$ if it is possible to remove a leaf of $\mathbf{s}_{n+1}$ and thereby obtain
$\mathbf{s}_n$. In this context, removing a leaf means (i) erasing the leaf and the edge of $\mathbf{s}_{n+1}$ which
had that leaf as an endpoint and then (ii) erasing any nonroot internal vertices of the
resulting graph that happen to have degree two - such a vertex appears if the parent of
the erased leaf happens to have degree 3 in $\mathbf{s}_{n+1}$.
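The two descriptions of leaf removal can be compared on a small case. In the sketch below a hierarchy is encoded as a frozenset of frozensets (an encoding chosen here for illustration), restriction is the map $H \mapsto H \cap S_0$ of (1), and the shape is represented by a canonical nested tuple in which leaf labels have been forgotten. The helper names are hypothetical.

```python
def trivial_hierarchy(labels):
    """The hierarchy consisting only of the ground set, the singletons and the empty set."""
    labels = frozenset(labels)
    return frozenset({labels, frozenset()} | {frozenset({s}) for s in labels})

def restrict(hierarchy, subset):
    """Restriction as in (1): intersect every member with the subset."""
    subset = frozenset(subset)
    return frozenset(block & subset for block in hierarchy)

def shape(hierarchy, top=None):
    """Canonical nested tuple: a leaf is (), an internal vertex is the sorted tuple of child shapes."""
    if top is None:
        top = max(hierarchy, key=len)          # the ground set is the largest member
    if len(top) == 1:
        return ()
    inside = [b for b in hierarchy if b and b < top]
    children = [b for b in inside if not any(b < c for c in inside)]
    return tuple(sorted(shape(hierarchy, c) for c in children))

# Balanced tree ((1,2),(3,4)) on [4]; removing leaf 4 leaves the tree ((1,2),3).
balanced = trivial_hierarchy({1, 2, 3, 4}) | {frozenset({1, 2}), frozenset({3, 4})}
comb3 = restrict(balanced, {1, 2, 3})
assert comb3 == trivial_hierarchy({1, 2, 3}) | {frozenset({1, 2})}
assert shape(comb3) == ((), ((), ()))

# Relabelling the leaves does not change the shape.
relabelled = trivial_hierarchy({1, 2, 3}) | {frozenset({2, 3})}
assert shape(relabelled) == shape(comb3)
```

In this encoding step (ii) of leaf removal needs no separate code: a vertex left with a single child corresponds to a set that coincides with its child after intersection, so it disappears automatically.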

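Let $\mathcal{S} := \{\mathbf{s}(\mathcal{H}_n) : \mathcal{H}_n \text{ a hierarchy on } [n] \text{ for some } n \ge 1\}$. With these definitions the
following analogues of (30)-(32) are obvious: if $(\mathcal{H}_n)$ is an exchangeable hierarchy on
$\mathbb{N}$, then there is a function $h : \mathcal{S} \to [0, 1]$ for which for every fixed hierarchy $H$ on $[n]$,

$$\mathbb{P}(\mathcal{H}_n = H) = h(\mathbf{s}(H)) \qquad (37)$$

$$h(\mathbf{s}_n) = \sum_{\mathbf{s}_{n+1} : \mathbf{s}_n \nearrow \mathbf{s}_{n+1}} h(\mathbf{s}_{n+1}) \qquad (38)$$

$$h(\mathbf{s}(\Xi([1]))) = 1. \qquad (39)$$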
Here, $\Xi([1])$ is the trivial hierarchy on $[1]$. Note that (29) and (30) together say that
$\mathbb{P}(\Pi_n = \pi_n)$ is a function of the equivalence class of $\pi_n$, two partitions of $[n]$ being
equivalent if and only if they have the same block sizes. Formula (37) likewise asserts
that $\mathbb{P}(\mathcal{H}_n = H)$ only depends on the equivalence class of $H$, i.e. the shape of the graph
of $H$. To strengthen the analogy between (38) and (31) we may write $\lambda \nearrow \lambda'$ for $\lambda, \lambda' \in \mathcal{C}$
to mean

$$\lambda = (\lambda_1, \ldots, \lambda_k) \text{ and } \lambda' \in \{(\lambda_1 + 1, \ldots, \lambda_k), \ldots, (\lambda_1, \ldots, \lambda_k + 1), (\lambda_1, \ldots, \lambda_k, 1)\}$$

and rewrite (31) accordingly. We remark that the two relations on $\mathcal{S}$ and on $\mathcal{C}$ that are
both denoted $\nearrow$ turn these respective spaces into graded graphs or lattices, and
that the problem of characterizing exchangeable hierarchies on $\mathbb{N}$ (solved
by Theorem 1 of the present work) and the problem of characterizing exchangeable
partitions of $\mathbb{N}$ (solved by Kingman's Theorem 2) are equivalent to characterizing the
classes of bounded, positive harmonic functions on these lattices. See [55, Chapter 0] for
much more on this topic.



Proposition 22. Suppose that $h : \mathcal{S} \to [0, 1]$ satisfies (38) for all $n \ge 1$ and (39). Then
there is an exchangeable random hierarchy $(\mathcal{H}_n)$ on $\mathbb{N}$ for which (37) holds for every fixed
hierarchy $H$ on $[n]$, for all $n \ge 1$.



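Proof. Let $\mathcal{H}_1 = \Xi([1])$, and assuming that $\mathcal{H}_1, \ldots, \mathcal{H}_n$ have been defined, conditionally
given $\mathcal{H}_n = H$, select $\mathcal{H}_{n+1}$ from the set

$$\{\text{hierarchies } H' \text{ on } [n+1] : H'|_{[n]} = H\},$$

selecting $H'$ with probability $h(\mathbf{s}(H'))/h(\mathbf{s}(H))$. □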
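The sequential construction in the proof can be sketched in code. The example below enumerates all hierarchies on $[n+1]$ that restrict to a given hierarchy on $[n]$, and then samples step by step with weights $h(H')/h(H)$. For $h$ it uses, as an assumption not taken from the text, the standard formula for labelled binary trees under the Yule (Kingman coalescent) model, $2^{n-1}/n! \cdot \prod_B (\#B - 1)^{-1}$ over internal blocks $B$, set to zero on non-binary trees. Hierarchies are encoded as frozensets of frozensets, and all function names are hypothetical.

```python
import random
from fractions import Fraction
from math import factorial

def extensions(hierarchy, new_label):
    """All hierarchies on [n+1] whose restriction to [n] is `hierarchy`."""
    leaf = frozenset({new_label})
    result = []
    for block in hierarchy:
        if not block:
            continue
        # Subdivide the edge above `block`: keep `block`, add block | leaf as its new parent.
        grown = {a | leaf if block < a else a for a in hierarchy}
        result.append(frozenset(grown | {block | leaf, leaf}))
        if len(block) >= 2:
            # Attach the new leaf directly below the internal vertex `block`.
            grown = {a | leaf if block <= a else a for a in hierarchy}
            result.append(frozenset(grown | {leaf}))
    return result

def yule_h(hierarchy):
    """Assumed example of h: probability of a labelled binary tree under the Yule model."""
    n = len(max(hierarchy, key=len))
    internal = [b for b in hierarchy if len(b) >= 2]
    if len(internal) != n - 1:          # fewer than n-1 internal vertices: not binary
        return Fraction(0)
    value = Fraction(2 ** (n - 1), factorial(n))
    for block in internal:
        value /= len(block) - 1
    return value

def sample_hierarchies(h, n_max, rng):
    """Sample (H_1, ..., H_{n_max}) by the sequential rule of the proof."""
    current = frozenset({frozenset(), frozenset({1})})    # trivial hierarchy on [1]
    path = [current]
    for n in range(1, n_max):
        candidates = extensions(current, n + 1)
        weights = [float(h(c) / h(current)) for c in candidates]
        current = rng.choices(candidates, weights=weights)[0]
        path.append(current)
    return path

path = sample_hierarchies(yule_h, 5, random.Random(0))
```

The weights sum to one exactly when $h$ is consistent under restriction, which is what the proof relies on. For the assumed Yule example this can be checked directly on small cases, for instance by summing `yule_h` over `extensions` of the tree ((1,2),3).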



As previously remarked, we have no analogue of (33), but we can write down a
hierarchies counterpart to (34). Towards this end, we introduce the following family of
symmetric polynomials. First, let $\mathbb{R}[x_\lambda, \lambda \in \mathcal{C}]$ denote the ring of formal power series
in (commuting) variables $\{x_\lambda, \lambda \in \mathcal{C}\}$, with coefficients in $\mathbb{R}$. Then for every function
$\sigma : \mathbb{N} \to \mathbb{N}$ let $x_\sigma$ denote the following generalized monomial.

$$x_\sigma := \prod_{n \ge 1} x_{(\sigma(1), \ldots, \sigma(n))}. \qquad (40)$$

Next, if $\Sigma$ is a mapping from $[k]$ to $\mathbb{N}^{\mathbb{N}}$, which is to say, if $\Sigma_i : \mathbb{N} \to \mathbb{N}$ for $i \in [k]$, then
let the class of $\Sigma$ be the following hierarchy on $[k]$,

$$\mathrm{class}(\Sigma) := \{\{j \in [k] : (\Sigma_j(1), \ldots, \Sigma_j(n)) = \lambda\} : n \ge 1 \text{ and } \lambda \in \mathbb{N}^n\} \cup \Xi([k]).$$

Finally, for $H$ a fixed hierarchy on $[k]$, let $m_H$ be the following symmetric polynomial.

$$m_H := \sum_{\Sigma : \mathrm{class}(\Sigma) = H} \prod_{j=1}^{k} x_{\Sigma_j},$$

where the sum is taken over all $\Sigma \in (\mathbb{N}^{\mathbb{N}})^{[k]}$. To see how these notions might be useful,
suppose that $\mathscr{H}$ is a nonrandom hierarchy on $[0,1]$ consisting of nested intervals: at the
top level of $\mathscr{H}$, $[0,1]$ is partitioned into intervals of lengths $w_1, w_2, \ldots$ which sum up
to 1, where we list these lengths in nonincreasing order of size. Then for every $i \ge 1$
the interval of length $w_i$ is partitioned into further subintervals of lengths $w_{(i,1)}, w_{(i,2)}, \ldots$ which
sum up to $w_i$, and these lengths are likewise ordered by size. We suppose that this
recursive process of splitting continues indefinitely: the intervals at depth $D$ have widths
$w_\lambda$ for $\lambda \in \mathbb{N}^D$, and every interval of width $w_\lambda$ is split into further subintervals of widths
$w_{(\lambda,1)}, w_{(\lambda,2)}, \ldots$, which sum up to $w_\lambda$, and are ordered by size. We then set $x_{(n)} := w_{(n)}$
for all $n \ge 1$ and

$$x_\lambda := \frac{w_\lambda}{w_{\lambda'}} \qquad (\mathrm{length}(\lambda) \ge 2)$$



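where $\lambda'$ is the composition derived by removing the final element of $\lambda$. The family
$\{x_\lambda, \lambda \in \mathcal{C}\}$ then has the following two properties,

(i) $\sum_{n \ge 1} x_{(n)} = 1$ and for every $\lambda \in \mathcal{C}$, $\sum_{n \ge 1} x_{(\lambda,n)} = 1$

(ii) $x_{(1)} \ge x_{(2)} \ge \ldots$ and $x_{(\lambda,1)} \ge x_{(\lambda,2)} \ge \ldots$

Note that for every $\lambda \in \mathcal{C}$ of length $n$,

$$w_\lambda = \sum_{\substack{\sigma : \mathbb{N} \to \mathbb{N} \\ (\sigma(1), \ldots, \sigma(n)) = \lambda}} x_\sigma$$

for $x_\sigma$ the monomial defined by (40). Now let $(U_1, \ldots, U_n)$ be iid uniform[0,1] random
variables, and define a hierarchy $\mathcal{H}_n$ on $[n]$ by

$$\mathcal{H}_n := \{\{j \in [n] : U_j \in B\} : B \in \mathscr{H}\}.$$

Then for every fixed hierarchy $H$ on $[n]$,

$$\mathbb{P}(\mathcal{H}_n = H) = m_H(x).$$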
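The passage from a nested-interval hierarchy on $[0,1]$ to a hierarchy on $[n]$ can be illustrated on a toy case. The sketch below assumes a finite-depth interval hierarchy given as an explicit list of half-open intervals, and fixed sample points in place of random uniforms so that the outcome is reproducible. The interval endpoints, the points and the function names are all illustrative assumptions.

```python
def sampled_hierarchy(intervals, points):
    """Collect {j : U_j in B} for each interval B, together with the trivial hierarchy on [n]."""
    n = len(points)
    blocks = {frozenset(), frozenset(range(1, n + 1))}
    blocks |= {frozenset({j}) for j in range(1, n + 1)}
    for left, right in intervals:
        blocks.add(frozenset(j for j, u in enumerate(points, start=1) if left <= u < right))
    return frozenset(blocks)

def is_hierarchy(blocks):
    """Laminarity: any two members are nested or disjoint."""
    return all(a & b in (a, b, frozenset()) for a in blocks for b in blocks)

# Top level: widths 0.5, 0.3, 0.2; the first interval is split again into widths 0.3 and 0.2.
intervals = [(0.0, 0.5), (0.5, 0.8), (0.8, 1.0), (0.0, 0.3), (0.3, 0.5)]
points = (0.1, 0.35, 0.6, 0.2)          # stand-ins for U_1, ..., U_4

H = sampled_hierarchy(intervals, points)
assert is_hierarchy(H)
assert frozenset({1, 2, 4}) in H and frozenset({1, 4}) in H
```

Here points 1, 2 and 4 fall in the first top-level interval, and 1 and 4 share its first subinterval, so the resulting tree has the nested blocks {1,4} inside {1,2,4} inside {1,2,3,4}. Replacing `points` by independent uniform draws gives one sample of the hierarchy whose law the text describes.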

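If the interval hierarchy is random then there is a probability measure $\mu$ on $\{x_\lambda, \lambda \in \mathcal{C}\}$
for which items (i) and (ii) above hold $\mu$-a.s., and in this case

$$\mathbb{P}(\mathcal{H}_n = H) = h(\mathbf{s}(H)) = \int_{\nabla^*} m_H(x) \, \mu(dx), \qquad (41)$$

where $\nabla^*$ denotes the set of families $x = (x_\lambda, \lambda \in \mathcal{C})$ for which

(i) $\sum_{n \ge 1} x_{(n)} \le 1$ and for every $\lambda \in \mathcal{C}$, $\sum_{n \ge 1} x_{(\lambda,n)} \le 1$

(ii) $x_{(1)} \ge x_{(2)} \ge \ldots$ and $x_{(\lambda,1)} \ge x_{(\lambda,2)} \ge \ldots$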



Equation (41) is our analogue of (34). We make no claims on rigor, here; topological
properties, convergence of distributions, and analytic properties of symmetric functions
have been well-studied for the Kingman simplex V |2Il EH [39] , but we have not even 
defined a measurable structure on V*. We leave such issues for another paper. 



Equation (34) does not describe the distribution of the most general exchangeable
partition, and likewise (41) does not describe the distribution of the most general
exchangeable hierarchy. We give two examples of exchangeable hierarchies $(\mathcal{H}_n)$ on $\mathbb{N}$ for
which (41) does not describe the distribution of $(\mathcal{H}_n)$. Throughout, $(U_n)$ is an IID family
of uniform [0,1] random variables.
(a) For $n \ge 1$ set $\mathcal{H}_n := \{\{j \in [n] : U_j > x\} : 0 \le x \le 1\}$.

(b) Let $(q_n)$ be an enumeration of the rational numbers in $[0,1]$, and let $(\epsilon_n)$ be a
sequence of positive numbers summing to 1/4. Let $\mathcal{W}$ be the open subset of $[0,1]$
defined by $\mathcal{W} := \bigcup_{n \ge 1} (q_n - \epsilon_n, q_n + \epsilon_n)$. Then set

$$\mathcal{H}_n := \{\{j \in [n] : U_j > x\} : x \in \mathcal{W}\} \cup \Xi([n]).$$



Speaking loosely, (a) is problematic because there is "continuous erosion", for which (41)
cannot account, and (b) is problematic because (41) assumes that the "recursive splits"
are well-ordered by inclusion. However, the requirement that the interval hierarchy $\mathscr{H}$
exhibit infinite recursive splitting can be overcome: we can model hierarchies that
stop splitting at finite depth by setting $w_\lambda = w_{(\lambda,1)} = w_{(\lambda,1,1)} = \ldots$ and $w_{(\lambda,n)} = 0$ for all
$n \ge 2$ for some $\lambda$, for example.

We end this section by observing the following multiplication rule for our symmetric
polynomials. Recall that for the usual monomial symmetric functions, there is the
multiplication rule

$$m_1(x) \, m_\lambda(x) = \sum_{\lambda' : \lambda \nearrow \lambda'} m_{\lambda'}(x)$$

which holds for all $\lambda \in \mathcal{C}$. Since $m_1(x) = 1$ on the set (36), this multiplication rule
implies that for $p$ defined by (34), $p$ satisfies (31). The analogue in our context is the
following: for any hierarchy $H$ on $[n]$, and with $m_1$ denoting the polynomial

$$m_1(x) := \sum_{\sigma : \mathbb{N} \to \mathbb{N}} x_\sigma,$$

there is the following multiplication rule,

$$m_1(x) \, m_H(x) = \sum_{H_{n+1} : H_{n+1}|_{[n]} = H} m_{H_{n+1}}(x).$$

Since $m_1(x) = 1$ on the set $\{x \in \nabla^* : x \text{ satisfies (i) and (ii) directly above}\}$, this
multiplication rule likewise implies that for $h$ defined by the second equality of (41), $h$
satisfies (38).



6.3 Tail measurability and open problems 

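If $(\mathcal{H}_n)$ is a random hierarchy on $\mathbb{N}$, define the tail sigma field of $(\mathcal{H}_n)$ as follows,

$$\mathrm{tail}(\mathcal{H}_n) := \bigcap_{n \ge 1} \sigma\big( \mathcal{H}_{n+1}|_{\{n+1\}}, \, \mathcal{H}_{n+2}|_{\{n+1, n+2\}}, \, \ldots \big).$$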

If $(\mathcal{H}_n)$ is an exchangeable hierarchy, then the pair $(T, p)$ of Theorem 1 is not
tail-measurable for the following reason. Let $\pi_1(x) = (x_1, 0, 0, \ldots)$ for $x \in \ell_1$. Then the image
of $p$ under $\pi_1$ is the directing measure for the exchangeable spinal variables
defined by (21), where $b$ is the bijection mentioned at the beginning of Section 4, but
neither these spinal variables nor their directing measure are tail measurable. On the
other hand,
Proposition 23. The distribution of the pair $(T, p)$ of Theorem 1 is measurable with
respect to the exchangeable sequence $(\mathcal{H}_n)$ of that theorem.

Proof. This is a direct consequence of the fact that the bijection $b$ mentioned at the
beginning of Section 4 can be chosen arbitrarily. □

Instead of proving the assertion that $(T, p)$ is not $\mathrm{tail}(\mathcal{H}_n)$-measurable, we offer the
following analogy using exchangeable partitions. Suppose that $\mathcal{U}$ is a random open
subset of $[0, 1]$ having Lebesgue measure one, and let $(U_n)$ and $(V_n)$ be independent
IID sequences of uniform[0,1] random variables, jointly independent of $\mathcal{U}$. Form an
exchangeable partition $\Pi$ of $\mathbb{N}$ by putting $i$ and $j$ in the same block of $\Pi$ if $U_i$ and $U_j$
fall in the same connected component of $\mathcal{U}$, and index the blocks $\{B_1, B_2, \ldots\}$ of $\Pi$ by
least elements, so that $1 = \min B_1 < \min B_2 < \ldots$. Then the limits

$$P_i = P_i(\Pi) = \lim_{n \to \infty} \frac{1}{n} \#(B_i \cap [n])$$

exist almost surely, and $P_i$ is the width of the interval of $\mathcal{U}$ containing $U_{\min B_i}$. Now
form another open subset $\mathcal{U}'$ by placing intervals of widths $P_i$ in left-to-right order,

$$\mathcal{U}' := (0, P_1) \cup \bigcup_{n \ge 1} (P_1 + \ldots + P_n, \, P_1 + \ldots + P_{n+1}).$$

Then $\mathcal{U}'$ is not measurable with respect to the tail of $\Pi$,

$$\mathrm{tail}(\Pi) := \bigcap_{n \ge 1} \sigma\big( \Pi|_{\{n, n+1, \ldots\}} \big),$$

because $P_1(\Pi)$ is not measurable with respect to $\mathrm{tail}(\Pi)$, but $P_1(\Pi)$ equals almost surely
the length of the connected component of $\mathcal{U}'$ whose left-endpoint is zero. Our weighted
tree $(T, p)$ is very much like the open subset $\mathcal{U}'$.

Continuing this discussion, it is evident that if $(P_1^{\downarrow}, P_2^{\downarrow}, \ldots)$ is the sequence of $P_i$'s
ranked in nonincreasing order, and

$$\mathcal{U}_{\mathrm{ranked}} := (0, P_1^{\downarrow}) \cup \bigcup_{n \ge 1} (P_1^{\downarrow} + \ldots + P_n^{\downarrow}, \, P_1^{\downarrow} + \ldots + P_{n+1}^{\downarrow}), \qquad (42)$$

then $\mathcal{U}_{\mathrm{ranked}}$ is measurable with respect to $\mathrm{tail}(\Pi)$. Deterministically reranking the
components of $\mathcal{U}'$ in nonincreasing order effectively erases the information contained
in $\mathcal{U}'$ but not contained in $\mathrm{tail}(\Pi)$. Obviously, the fact that the resulting order is by
decreasing length is immaterial; any deterministic ordering will do.
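The effect of reranking can be seen on a toy example with finitely many components. In the sketch below, two hypothetical orders of appearance of the same three block frequencies give different left-to-right arrangements, while the ranked arrangement of (42) is the same for both. The widths and the helper name `place_intervals` are illustrative assumptions.

```python
from fractions import Fraction
from itertools import accumulate

def place_intervals(widths):
    """Lay intervals of the given widths side by side from 0, in the order given."""
    ends = list(accumulate(widths))
    starts = [Fraction(0)] + ends[:-1]
    return list(zip(starts, ends))

def ranked_intervals(widths):
    """The arrangement of (42): widths sorted in nonincreasing order before placement."""
    return place_intervals(sorted(widths, reverse=True))

# Two orders of appearance of the same frequencies (hypothetical values).
appearance_a = [Fraction(1, 6), Fraction(1, 2), Fraction(1, 3)]
appearance_b = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]

# The left-to-right arrangement depends on which block happened to contain 1, 2, ...
assert place_intervals(appearance_a) != place_intervals(appearance_b)
# ... whereas the ranked arrangement depends only on the set of frequencies.
assert ranked_intervals(appearance_a) == ranked_intervals(appearance_b)
```

Any other deterministic rule for ordering the widths would serve equally well in `ranked_intervals`, in line with the remark that decreasing length is immaterial.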

We may now state the previously-promised stronger version of Kingman's theorem 
and some open problems. 

Theorem 24 ([56]). Suppose that $\Pi$ is an exchangeable partition of $\mathbb{N}$ and that the probability
space supports a sequence $(U_n)$ of IID uniform[0,1] random variables independent
of $\Pi$. Then there is a $\Pi$-measurable random open subset $\mathcal{U}$ of $[0,1]$ such that if $\Pi'$ is the
partition defined by

$$\{i \text{ and } j \text{ in same block of } \Pi'\} = \{U_i \text{ and } U_j \text{ in same component of } \mathcal{U}, \text{ or } i = j\},$$

then there is the equality of joint distributions

$$(\Pi, \mathcal{U}) \overset{d}{=} (\Pi', \mathcal{U}). \qquad (43)$$

Question 1 Define an equivalence relation $\sim$ on laws of weighted real trees, writing
$\mathscr{L}(T, p) \sim \mathscr{L}(T', p')$ if and only if $(\mathcal{H}_n) \overset{d}{=} (\mathcal{H}'_n)$ for $(\mathcal{H}_n)$ and $(\mathcal{H}'_n)$ exchangeable
hierarchies derived by sampling from $(T, p)$ and $(T', p')$, respectively.

Is there a nice way of telling whether or not $\mathscr{L}(T, p) \sim \mathscr{L}(T', p')$? Speaking loosely,
it should be possible to prune away tree branches of $T$ that carry no $p$-mass, and also
stretch segments of $T$ arbitrarily, and not change the equivalence class of $\mathscr{L}(T, p)$. Purely
topological considerations are not quite enough to settle this question: suppose that $T_1$
is the tree $[0,1]$ rooted at 1 and $p_1$ is Lebesgue measure on $[0,1]$, and suppose that $T_2$ is
the half line $[0, \infty)$ rooted at 0, and $p_2$ is the exponential(1) distribution on $T_2$. Then
$T_1$ and $T_2$ are not homeomorphic, but $\mathscr{L}(T_1, p_1) \sim \mathscr{L}(T_2, p_2)$.

Question 2 Is there a nice way to select from each equivalence class of ~ above a 
unique representative of that equivalence class? Such a recipe would be akin to reorder- 
ing component intervals of open subsets of [0,1], as discussed above. By nice we mean 
measurable, and you can pick the sigma fields, but the goal is to have an analogue of the
strong version of Kingman's theorem involving an equality of joint distributions, as in
(43).

Question 3 Repeat the previous questions in the context of Theorem [Sj, i.e. with
hierarchies on $[0,1]$ instead of weighted real trees in $\ell_1$.

We thank David Aldous, Steve Evans, and Matthias Winkel for helpful discussion. 